Self-supervised learning (SSL) learns useful representations from unlabelled data by training networks to produce invariant outputs across augmented versions of the same input. Non-contrastive methods avoid representational collapse either by directly regularizing the covariance matrix of the network outputs or through asymmetric loss architectures, two seemingly unrelated approaches. Here, building on DirectPred, we lay out a theoretical framework that reconciles these two views. We derive analytical expressions for the representational learning dynamics in linear networks. By expressing them in the eigenspace of the embedding covariance matrix, where the solutions decouple, we reveal the mechanism and the conditions under which implicit variance regularization arises. These insights allow us to formulate a new isotropic loss function that equalizes the contribution of each eigenvalue and renders learning more robust. Finally, we show empirically that our findings translate to nonlinear networks trained on CIFAR-10 and STL-10.
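To fix ideas, here is a minimal NumPy sketch of the DirectPred-style predictor construction that the framework builds on: the symmetric predictor matrix is set directly from the eigendecomposition of the embedding covariance rather than learned by gradient descent, which is what allows the learning dynamics to be analyzed mode by mode in that eigenbasis. The function name `directpred_predictor`, the exponent `alpha`, and the ridge term `eps` are illustrative assumptions, not the exact formulation used in the paper.

```python
import numpy as np

def directpred_predictor(z, alpha=0.5, eps=1e-4):
    """Set a symmetric predictor matrix directly from the embedding
    covariance, in the spirit of DirectPred.

    z:     (batch, dim) array of embeddings from the online network.
    alpha: exponent applied to the eigenvalue spectrum (illustrative).
    eps:   small ridge term to keep the predictor well conditioned.
    """
    # Covariance of the embeddings over the batch.
    zc = z - z.mean(axis=0, keepdims=True)
    cov = zc.T @ zc / len(z)

    # Work in the eigenbasis of the embedding covariance; this is the
    # basis in which the linear learning dynamics decouple.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)

    # Predictor spectrum: a power of the normalized embedding spectrum.
    p_spec = (eigvals / (eigvals.max() + 1e-12)) ** alpha + eps

    # Reassemble the symmetric predictor from its spectrum.
    return eigvecs @ np.diag(p_spec) @ eigvecs.T
```

In this sketch the predictor shares its eigenvectors with the embedding covariance by construction, which is the property that lets the representational dynamics be written as independent per-eigenmode equations.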