Barlow Twins and VICReg are self-supervised representation learning models that use regularizers to decorrelate features. Although they perform as well as conventional representation learning models, their training can be computationally demanding when the dimension of the projected representations is high: because these regularizers are defined in terms of individual elements of a cross-correlation or covariance matrix, computing the loss for $d$-dimensional projected representations of $n$ samples takes $O(n d^2)$ time. In this paper, we propose a relaxed version of the decorrelating regularizers that can be computed in $O(n d \log d)$ time via the fast Fourier transform. We also propose an inexpensive trick to mitigate the undesirable local minima introduced by the relaxation. Models that learn representations with the proposed regularizers achieve accuracy comparable to existing models on downstream tasks, while training requires less memory and is faster when $d$ is large.
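As a concrete illustration of the complexity claims, the sketch below contrasts the standard Barlow Twins redundancy-reduction loss, which materializes the full $d \times d$ cross-correlation matrix, with one plausible FFT-based relaxation that constrains only the sums along the matrix's circular diagonals. The function names, the hyperparameter `lam`, and the particular form of the relaxation are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Standard Barlow Twins loss: builds the full d x d cross-correlation
    matrix, so the cost is O(n d^2) time and O(d^2) memory."""
    n, d = z_a.shape
    # Standardize each feature dimension over the batch.
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    c = (z_a.T @ z_b) / n                      # (d, d) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lam * off_diag

def relaxed_loss_fft(z_a, z_b, lam=5e-3):
    """Hypothetical relaxation: instead of penalizing every off-diagonal
    entry of C, penalize the sums along its d circular diagonals. These
    are the batch-averaged circular cross-correlations of the two views,
    computable in O(n d log d) with the FFT."""
    n, d = z_a.shape
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    fa = torch.fft.rfft(z_a, dim=1)
    fb = torch.fft.rfft(z_b, dim=1)
    # r[s, k] = sum_i z_a[s, i] * z_b[s, (i + k) mod d]
    r = torch.fft.irfft(fa.conj() * fb, n=d, dim=1)
    diag_sums = r.mean(0)                      # k-th circular-diagonal sum of C
    on_diag = (diag_sums[0] / d - 1).pow(2)    # trace(C) / d should be 1
    off_diag = diag_sums[1:].pow(2).sum()      # remaining diagonal sums -> 0
    return on_diag + lam * off_diag
```

Because the relaxed loss constrains only $d$ aggregate quantities rather than all $d^2$ matrix entries, it admits solutions the original loss would forbid, which is plausibly the source of the undesirable local minima that the proposed trick is meant to mitigate.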