Recently, contrastive learning has achieved impressive success in advancing the state of the art on various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number $k$ of negative examples, while it has been widely shown in practice that choosing a large $k$ is necessary to guarantee good generalization of contrastive learning in downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates under a low-noise condition. We apply our results to learning with both linear representations and nonlinear representations given by deep neural networks, and for both we derive Rademacher complexity bounds to obtain improved generalization bounds.
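For concreteness, the following is a minimal sketch of a commonly studied contrastive-learning formulation with $k$ negative examples, which we assume is the setting referred to above; here $f$ denotes the learned representation and $\ell$ the (Lipschitz) loss applied to the $k$ similarity gaps, with the logistic loss as one standard choice:
$$
L_{\mathrm{un}}(f) \;=\; \mathbb{E}_{(x,\,x^{+},\,x_{1}^{-},\ldots,x_{k}^{-})}\Big[\ell\big(\{f(x)^{\top}\big(f(x^{+}) - f(x_{i}^{-})\big)\}_{i=1}^{k}\big)\Big],
\qquad
\ell(\mathbf{v}) \;=\; \log\Big(1 + \sum_{i=1}^{k}\exp(-v_{i})\Big).
$$
In this sketch, the empirical counterpart $\widehat{L}_{\mathrm{un}}(f)$ averages $\ell$ over $n$ observed tuples, and the generalization gap of interest is $L_{\mathrm{un}}(f) - \widehat{L}_{\mathrm{un}}(f)$: existing bounds on this gap carry a factor linear in $k$, whereas the bounds developed here depend on $k$ only logarithmically.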