How can you sample good negative examples for contrastive learning? We argue that, as with metric learning, contrastive learning of representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor point). The key challenge in using hard negatives is that contrastive methods must remain unsupervised, making it infeasible to adopt existing negative sampling strategies that rely on true similarity information. In response, we develop a new family of unsupervised sampling methods for selecting hard negative samples where the user can control the hardness. A limiting case of this sampling yields a representation that tightly clusters each class and pushes different classes as far apart as possible. The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.
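To make the "few additional lines of code" claim concrete, below is a minimal PyTorch sketch of one way hardness-controlled negative sampling could be realized: negatives are reweighted inside an InfoNCE-style loss by importance weights that grow with their similarity to the anchor. The abstract does not spell out the exact mechanism, so this is an illustration under stated assumptions; the function name `hard_negative_contrastive_loss` and the parameters `beta` (hardness concentration) and `temperature` are hypothetical, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def hard_negative_contrastive_loss(anchor, positive, negatives,
                                   temperature=0.5, beta=1.0):
    """InfoNCE-style loss with hardness-weighted negatives (illustrative sketch).

    anchor:    (B, D) anchor embeddings
    positive:  (B, D) embeddings of positive pairs
    negatives: (B, N, D) embeddings of N negatives per anchor
    beta:      hardness concentration; beta = 0 recovers uniform weighting
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_sim = torch.sum(anchor * positive, dim=-1) / temperature           # (B,)
    neg_sim = torch.einsum('bd,bnd->bn', anchor, negatives) / temperature  # (B, N)

    # Importance weights: up-weight negatives that are similar to the anchor
    # (the hard ones). Detached so the weights act as sampling probabilities
    # rather than opening an extra gradient path.
    weights = torch.softmax(beta * neg_sim.detach(), dim=-1)               # (B, N)

    # Scale by N so that uniform weights reproduce the usual sum over negatives.
    neg_term = (weights * torch.exp(neg_sim)).sum(dim=-1) * negatives.shape[1]

    # -log( exp(pos) / (exp(pos) + weighted negative mass) )
    loss = -pos_sim + torch.log(torch.exp(pos_sim) + neg_term)
    return loss.mean()

# Usage with random embeddings, for shape-checking only:
B, N, D = 32, 16, 128
loss = hard_negative_contrastive_loss(torch.randn(B, D), torch.randn(B, D),
                                      torch.randn(B, N, D), beta=1.0)
```

With `beta = 0` the weights are uniform and the loss reduces to a standard InfoNCE objective; increasing `beta` concentrates the negative term on the negatives closest to the anchor, which is one way a user could dial up hardness as the abstract describes.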