Contrastive loss and its variants have recently become very popular for learning visual representations without supervision. In this work, we study three intriguing properties of contrastive learning. We first generalize the standard contrastive loss to a broader family of losses, and we find that various instantiations of the generalized loss perform similarly in the presence of a multi-layer non-linear projection head. We then study whether instance-based contrastive learning (such as SimCLR, MoCo, and BYOL, which are based on global image representations) can learn well on images with multiple objects present. We find that meaningful hierarchical local features can be learned despite the fact that these objectives operate on global instance-level features. Finally, we study an intriguing phenomenon of feature suppression among competing features shared across augmented views, such as "color distribution" vs. "object class". We construct datasets with explicit and controllable competing features, and show that, for contrastive learning, a few bits of easy-to-learn shared features can suppress, and even fully prevent, the learning of other sets of competing features. In scenarios where multiple objects appear in an image, the dominant object can suppress the learning of smaller objects. Existing contrastive learning methods critically rely on data augmentation to favor certain sets of features over others, and face potential limitations in scenarios where existing augmentations cannot fully address feature suppression. This poses open challenges to existing contrastive learning techniques.
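The standard contrastive loss that the abstract generalizes is the NT-Xent (SimCLR-style) objective over pairs of augmented views. A minimal NumPy sketch, assuming L2-normalized embeddings and a toy batch (the function name and setup are illustrative, not from the paper's code):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Standard contrastive (NT-Xent) loss over paired augmented views.

    z1, z2: (N, D) arrays of L2-normalized embeddings of two views
    of the same N images; row i of z1 and row i of z2 are a positive pair.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)        # (2N, D) all embeddings
    sim = z @ z.T / temperature                 # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)              # exclude self-similarity
    # Positive for row i in the first half is row i + N, and vice versa.
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy: pull the positive pair together against all others.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()

# Toy usage with random normalized embeddings.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8)); z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = rng.normal(size=(4, 8)); z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
loss = nt_xent_loss(z1, z2)
```

The generalized family studied in the paper varies how the repulsive (denominator) term is weighted; this sketch shows only the standard instantiation that those variants reduce to.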