Contrastive methods have led a recent surge in the performance of self-supervised representation learning (SSL). Recent methods like BYOL or SimSiam purportedly distill these contrastive methods down to their essence, removing bells and whistles, including the negative examples, that do not contribute to downstream performance. These "non-contrastive" methods work surprisingly well without using negatives even though the global minimum lies at trivial collapse. We empirically analyze these non-contrastive methods and find that SimSiam is extraordinarily sensitive to dataset and model size. In particular, SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size. We propose a metric to measure the degree of this collapse and show that it can be used to forecast the downstream task performance without any fine-tuning or labels. We further analyze architectural design choices and their effect on the downstream performance. Finally, we demonstrate that shifting to a continual learning setting acts as a regularizer and prevents collapse, and a hybrid between continual and multi-epoch training can improve linear probe accuracy by as many as 18 percentage points using ResNet-18 on ImageNet. Our project page is at https://alexanderli.com/noncontrastive-ssl/.
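The abstract mentions a metric for quantifying partial dimensional collapse of SimSiam representations. The paper's exact definition is given in the body; below is a minimal, hedged sketch of one common way to measure such collapse, via the eigenvalue spectrum of the feature covariance matrix. The function name, the 90% variance threshold, and the synthetic data are illustrative assumptions, not the authors' specification.

```python
# Illustrative sketch (not necessarily the paper's exact metric): quantify partial
# dimensional collapse of SSL representations from the eigenvalue spectrum of the
# feature covariance. A collapsed representation concentrates its variance in a
# few directions, so few dimensions suffice to explain most of the variance.

import numpy as np

def collapse_metric(features: np.ndarray, variance_threshold: float = 0.9) -> float:
    """Fraction of feature dimensions needed to explain `variance_threshold`
    of the total variance. Values near 0 indicate strong dimensional collapse;
    values near 1 indicate variance spread across most dimensions.

    features: array of shape (num_samples, feature_dim), e.g. embeddings of a
    held-out image set from a ResNet-18 backbone (hypothetical setup).
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(features) - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]        # eigenvalues, descending
    eigvals = np.clip(eigvals, 0.0, None)          # guard against tiny negatives
    explained = np.cumsum(eigvals) / eigvals.sum() # cumulative explained variance
    dims_needed = int(np.searchsorted(explained, variance_threshold) + 1)
    return dims_needed / features.shape[1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    healthy = rng.normal(size=(4096, 512))                                # full-rank features
    collapsed = rng.normal(size=(4096, 32)) @ rng.normal(size=(32, 512))  # rank-32 features
    print("healthy  :", collapse_metric(healthy))    # close to the 0.9 threshold
    print("collapsed:", collapse_metric(collapsed))  # much smaller
```

Because such a spectrum-based score needs only unlabeled embeddings, it is the kind of quantity that can be computed during pretraining to forecast downstream linear-probe performance without fine-tuning or labels, as the abstract describes.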