Data mixing (e.g., Mixup, CutMix, ResizeMix) is an essential component for advancing recognition models. In this paper, we study its effectiveness in the self-supervised setting. Noticing that mixed images sharing the same source images are intrinsically related to each other, we propose SDMP, short for $\textbf{S}$imple $\textbf{D}$ata $\textbf{M}$ixing $\textbf{P}$rior, to capture this straightforward yet essential prior, and position such mixed images as additional $\textbf{positive pairs}$ to facilitate self-supervised representation learning. Our experiments verify that the proposed SDMP enables data mixing to help a range of self-supervised learning frameworks (e.g., MoCo) achieve better accuracy and out-of-distribution robustness. More notably, SDMP is the first method that successfully leverages data mixing to improve (rather than hurt) the performance of Vision Transformers in the self-supervised setting. Code is publicly available at https://github.com/OliverRensu/SDMP
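The core observation can be illustrated with a minimal sketch (not the authors' implementation; the function and variable names here are illustrative assumptions): two Mixup images generated from the same pair of source images, under different mixing ratios, are intrinsically related and can serve as an extra positive pair for contrastive learning.

```python
import numpy as np

def mixup(x_a, x_b, lam):
    # Standard Mixup: a convex combination of two source images,
    # weighted by the mixing coefficient lam in [0, 1].
    return lam * x_a + (1 - lam) * x_b

rng = np.random.default_rng(0)
img_a, img_b = rng.random((2, 32, 32, 3))  # two toy "source images"

# Two mixed images that share the same source images but use
# different mixing ratios. The data mixing prior treats this pair
# as an additional positive pair during self-supervised training.
view1 = mixup(img_a, img_b, lam=0.6)
view2 = mixup(img_a, img_b, lam=0.3)
```

In an actual contrastive framework such as MoCo, `view1` and `view2` would be encoded and their representations pulled together alongside the usual augmented-view positives.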