Self-supervised learning (SSL) allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem for many self-supervised learning approaches: it renders self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results. We propose a self-supervised learning method, CorInfoMax, that uses a mutual information measure based on second-order statistics, which reflects the level of correlation among its arguments. Maximizing this correlative information measure between alternative representations of the same input serves two purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it establishes relevance among alternative representations by increasing the linear dependence among them. An approximation of the proposed information maximization objective simplifies to a Euclidean distance-based objective function regularized by the log-determinant of the feature covariance matrix. The regularization term acts as a natural barrier against feature space degeneracy. Consequently, beyond avoiding complete output collapse to a single point, the proposed approach also prevents dimensional collapse by encouraging the spread of information across the whole feature space. Numerical experiments demonstrate that CorInfoMax achieves better or competitive performance relative to state-of-the-art SSL approaches.
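To make the shape of such an objective concrete, the following is a minimal NumPy sketch of a Euclidean distance term regularized by the log-determinant of the feature covariance. It is an illustration under stated assumptions, not the authors' implementation: the function names `logdet_cov` and `corinfomax_style_loss`, the weight `alpha`, and the diagonal loading `eps` are hypothetical, and the actual CorInfoMax objective may differ in scaling and in how the covariance is estimated.

```python
import numpy as np

def logdet_cov(z, eps=1e-3):
    """Log-determinant of the sample covariance of z (shape n x d),
    with eps * I added so the value stays finite for degenerate features."""
    zc = z - z.mean(axis=0, keepdims=True)
    cov = zc.T @ zc / (z.shape[0] - 1)
    _, logdet = np.linalg.slogdet(cov + eps * np.eye(z.shape[1]))
    return logdet

def corinfomax_style_loss(z1, z2, alpha=1.0, eps=1e-3):
    """Hypothetical loss to minimize: mean squared Euclidean distance between
    the two views, minus the log-det of each view's feature covariance."""
    dist = np.mean(np.sum((z1 - z2) ** 2, axis=1))
    return alpha * dist - logdet_cov(z1, eps) - logdet_cov(z2, eps)

rng = np.random.default_rng(0)
n, d = 256, 8

spread = rng.normal(size=(n, d))                          # full-rank features
dim_collapsed = np.outer(rng.normal(size=n), np.ones(d))  # rank-1 features
collapsed = np.ones((n, d))                               # every input maps to one point

# Identical views are used so the distance term is zero and only the
# log-determinant regularizer differs between the three cases.
loss_spread = corinfomax_style_loss(spread, spread)
loss_dim = corinfomax_style_loss(dim_collapsed, dim_collapsed)
loss_collapsed = corinfomax_style_loss(collapsed, collapsed)

print(loss_spread, loss_dim, loss_collapsed)
```

In this toy setting the distance term alone cannot tell the three cases apart, since it is zero for all of them. The log-determinant term assigns the lowest loss to the well-spread features, a higher loss to the rank-1 (dimensionally collapsed) features, and the highest loss to the fully collapsed ones, which is the barrier behavior described above.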