Self-supervised learning (SSL) is rapidly closing the gap with supervised methods on large computer vision benchmarks. A successful approach to SSL is to learn representations which are invariant to distortions of the input sample. However, a recurring issue with this approach is the existence of trivial constant representations. Most current methods avoid such collapsed solutions by careful implementation details. We propose an objective function that naturally avoids such collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible. This causes the representation vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors. The method is called Barlow Twins, owing to neuroscientist H. Barlow's redundancy-reduction principle applied to a pair of identical networks. Barlow Twins does not require large batches nor asymmetry between the network twins such as a predictor network, gradient stopping, or a moving average on the weight updates. It allows the use of very high-dimensional output vectors. Barlow Twins outperforms previous methods on ImageNet for semi-supervised classification in the low-data regime, and is on par with current state of the art for ImageNet classification with a linear classifier head, and for transfer tasks of classification and object detection.
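To make the objective concrete, below is a minimal sketch of how a cross-correlation-to-identity loss of this kind could be implemented, assuming PyTorch. The function name, the per-dimension batch normalization, and the off-diagonal weighting coefficient `lambda_offdiag` are illustrative assumptions, not specifics stated in the abstract above.

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor, lambda_offdiag: float = 5e-3) -> torch.Tensor:
    """Sketch of a Barlow-Twins-style objective.

    z_a, z_b: (N, D) embeddings of two distorted views of the same batch of samples.
    """
    N, D = z_a.shape

    # Normalize each embedding dimension along the batch (zero mean, unit std),
    # so the matrix below is a cross-correlation matrix.
    z_a_norm = (z_a - z_a.mean(dim=0)) / z_a.std(dim=0)
    z_b_norm = (z_b - z_b.mean(dim=0)) / z_b.std(dim=0)

    # D x D cross-correlation matrix between the two views' outputs.
    c = (z_a_norm.T @ z_b_norm) / N

    # Invariance term: push the diagonal toward 1 (distorted views map to similar vectors).
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()

    # Redundancy-reduction term: push off-diagonal entries toward 0
    # (decorrelate the components of the representation).
    off_diag = c.pow(2).sum() - torch.diagonal(c).pow(2).sum()

    return on_diag + lambda_offdiag * off_diag
```

In a training loop, `z_a` and `z_b` would be the outputs of the same network applied to two augmented versions of each image in a batch; the loss drives the cross-correlation matrix toward the identity, as described in the abstract.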