Contrastive methods have led a recent surge in the performance of self-supervised representation learning (SSL). Recent methods such as BYOL and SimSiam purportedly distill these contrastive methods down to their essence, removing the bells and whistles, including negative examples, that do not contribute to downstream performance. These "non-contrastive" methods work surprisingly well without using negatives, even though the global minimum lies at trivial collapse. We empirically analyze these non-contrastive methods and find that SimSiam is extraordinarily sensitive to dataset and model size. In particular, SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size. We propose a metric to measure the degree of this collapse and show that it can be used to forecast downstream task performance without any fine-tuning or labels. We further analyze architectural design choices and their effect on downstream performance. Finally, we demonstrate that shifting to a continual learning setting acts as a regularizer and prevents collapse, and that a hybrid of continual and multi-epoch training can improve linear probe accuracy by as many as 18 percentage points using ResNet-18 on ImageNet.
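The abstract does not specify the exact form of the collapse metric. As a hedged illustration only, and not the paper's definition, a common proxy for partial dimensional collapse is the effective rank of the embedding covariance matrix, computed as the exponential of the entropy of its normalized eigenvalue spectrum; the function and variable names below are hypothetical.

```python
# Hypothetical sketch (not the paper's metric): measure partial dimensional
# collapse via the effective rank of the embedding covariance.
import numpy as np


def effective_rank(features: np.ndarray) -> float:
    """features: (N, D) matrix of representations for N images."""
    # Center the features and form the D x D covariance matrix.
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(features) - 1)
    # Eigenvalues of a symmetric PSD matrix; clip tiny negatives from
    # numerical round-off.
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    p = eigvals / eigvals.sum()
    p = p[p > 0]
    # exp(entropy) lies in [1, D]; values far below D indicate that the
    # representation occupies only a low-dimensional subspace, i.e. collapse.
    return float(np.exp(-(p * np.log(p)).sum()))
```

Under this proxy, a healthy encoder yields an effective rank close to the feature dimension D, while a partially collapsed one concentrates variance in a few directions and scores far lower; no labels or fine-tuning are needed, matching the label-free forecasting described above.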