We study the problem of designing hard negative sampling distributions for unsupervised contrastive representation learning. We analyze a novel min-max framework that seeks a representation minimizing the maximum (worst-case) generalized contrastive learning loss over all couplings (joint distributions between positive and negative samples subject to marginal constraints), and prove that the resulting min-max optimal representation is degenerate. This provides the first theoretical justification for incorporating additional regularization constraints on the couplings. We re-interpret the min-max problem through the lens of Optimal Transport theory and use regularized transport couplings to control the degree of hardness of the negative examples. We demonstrate that recently proposed state-of-the-art hard negative sampling distributions are a special case corresponding to entropic regularization of the coupling.
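As a minimal sketch of the entropic-regularization special case mentioned above: with an entropic regularizer, the optimal coupling places weight on each negative proportional to the exponentiated anchor-negative similarity, i.e. a softmax with an inverse-temperature parameter (here called `beta`, an assumed name) that controls hardness. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def entropic_negative_weights(anchor, negatives, beta=1.0):
    """Weight negatives proportional to exp(beta * similarity) -- the
    closed-form solution of an entropically regularized coupling,
    equivalently a softmax over anchor-negative similarities.
    Larger beta concentrates mass on harder (more similar) negatives."""
    sims = negatives @ anchor          # inner-product similarities
    logits = beta * sims
    logits -= logits.max()             # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()                 # normalize to a distribution

# Toy example with 2-D unit vectors; the first negative is the "hardest"
# because it is nearly aligned with the anchor.
anchor = np.array([1.0, 0.0])
negatives = np.array([[0.99, np.sqrt(1.0 - 0.99**2)],
                      [0.0, 1.0],
                      [-1.0, 0.0]])
w = entropic_negative_weights(anchor, negatives, beta=5.0)
```

With `beta → 0` the weights revert to uniform sampling over negatives; increasing `beta` interpolates toward placing all mass on the single hardest negative.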