Contrastive learning is a promising recent approach to unsupervised representation learning, in which a feature representation of data is learned by solving a pseudo classification problem constructed from unlabelled data. However, it is not straightforward to understand what representation contrastive learning yields. In addition, contrastive learning is often based on maximum likelihood estimation, which tends to be vulnerable to contamination by outliers. To promote the understanding of contrastive learning, this paper first theoretically shows a connection to maximization of mutual information (MI). Our result indicates that density ratio estimation is necessary and sufficient for maximization of MI under some conditions. Thus, contrastive learning methods that perform density ratio estimation, as in popular objective functions, can be interpreted as maximizing MI. Next, with the density ratio, we establish new recovery conditions for the latent source components in nonlinear independent component analysis (ICA). In contrast with existing work, the established conditions include a novel insight into the dimensionality of data, which is clearly supported by numerical experiments. Furthermore, inspired by nonlinear ICA, we propose a novel framework for estimating a nonlinear subspace of lower-dimensional latent source components, and theoretical conditions for the subspace estimation are established with the density ratio. Then, we propose a practical method based on outlier-robust density ratio estimation, which can be seen as performing maximization of MI, nonlinear ICA, or nonlinear subspace estimation. Moreover, a sample-efficient nonlinear ICA method is also proposed. We theoretically investigate the outlier-robustness of the proposed methods. Finally, the usefulness of the proposed methods is numerically demonstrated in nonlinear ICA and through application to linear classification.
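As a brief illustration of the density-ratio view of contrastive learning (a sketch based on the standard InfoNCE bound, not necessarily the exact objective analyzed in this paper): for a positive pair $(x, y_1) \sim p(x, y)$ and negatives $y_2, \dots, y_K \sim p(y)$, the InfoNCE loss
\[
\mathcal{L}_{\mathrm{NCE}} = -\,\mathbb{E}\!\left[ \log \frac{e^{f(x, y_1)}}{\sum_{k=1}^{K} e^{f(x, y_k)}} \right]
\]
satisfies $I(X; Y) \ge \log K - \mathcal{L}_{\mathrm{NCE}}$, and its optimal critic obeys $e^{f^\star(x, y)} \propto p(x, y) / \bigl(p(x)\,p(y)\bigr)$, i.e., it recovers the density ratio up to a multiplicative constant. In this sense, estimating the density ratio and maximizing a lower bound on MI go hand in hand.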