Unsupervised contrastive learning has attracted increasing attention in recent research and has proven to be a powerful method for learning representations from unlabeled data. However, little theoretical analysis is available for this framework. In this paper, we study the optimization of deep unsupervised contrastive learning. We prove that, by applying end-to-end training that simultaneously updates two deep over-parameterized neural networks, one can find an approximate stationary point of the non-convex contrastive loss. This result is inherently different from existing over-parameterization analyses in the supervised setting: rather than learning a specific target function, unsupervised contrastive learning attempts to encode the unlabeled data distribution into the neural networks, a problem that generally has no optimal solution. Our analysis provides theoretical insight into the practical success of these unsupervised pretraining methods.
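The following is a minimal sketch of the training setup the abstract refers to, not the paper's own implementation: two encoders updated simultaneously, end to end, by gradient descent on a contrastive loss. The InfoNCE-style loss, the encoder widths, and the synthetic "augmented views" are illustrative assumptions chosen only to make the example self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Over-parameterized MLP encoder (width chosen arbitrarily for illustration)."""
    def __init__(self, in_dim=128, width=4096, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, out_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

def contrastive_loss(z1, z2, temperature=0.1):
    """InfoNCE-style loss: pairs (i, i) are positives, all other pairs are negatives."""
    logits = z1 @ z2.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))   # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# End-to-end training: one optimizer step updates both networks simultaneously.
f, g = Encoder(), Encoder()
opt = torch.optim.SGD(list(f.parameters()) + list(g.parameters()), lr=1e-2)

for _ in range(10):  # toy loop over random "augmented views" of the same inputs
    x = torch.randn(32, 128)
    view1 = x + 0.1 * torch.randn_like(x)
    view2 = x + 0.1 * torch.randn_like(x)
    loss = contrastive_loss(f(view1), g(view2))
    opt.zero_grad()
    loss.backward()
    opt.step()  # gradient step driving the parameters toward a stationary point
```

In this non-convex problem there is no "target function" to recover; the analysis in the paper concerns how such simultaneous gradient updates of both over-parameterized networks reach an approximate stationary point of the contrastive objective.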