Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated, class-balanced datasets. However, natural data can be highly correlated and long-tail distributed. Natural between-instance similarity conflicts with the presumed instance distinction, causing unstable training and poor performance. Our idea is to discover and integrate between-instance similarity into contrastive learning, not directly by instance grouping, but by cross-level discrimination (CLD) between instances and local instance groups. While invariant mapping of each instance is imposed by attraction within its augmented views, between-instance similarity can emerge from common repulsion against instance groups. Our batch-wise and cross-view comparisons also greatly improve the positive/negative sample ratio of contrastive learning and achieve better invariant mapping. To serve both the grouping and the discrimination objectives, we impose them on features separately derived from a shared representation. In addition, we propose normalized projection heads and, for the first time, unsupervised hyper-parameter tuning. Our extensive experimentation demonstrates that CLD is a lean and powerful add-on to existing methods such as NPID, MoCo, InfoMin, and BYOL on highly correlated, long-tail, or balanced datasets. It not only achieves a new state-of-the-art on self-supervision, semi-supervision, and transfer-learning benchmarks, but also beats MoCo v2 and SimCLR on every reported performance, each attained with much larger compute. CLD effectively brings unsupervised learning closer to natural data and real-world applications. Our code is publicly available at: https://github.com/frank-xwang/CLD-UnsupervisedLearning.
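To make the cross-level idea concrete, below is a minimal PyTorch sketch of one direction of cross-level discrimination, assuming two augmented views whose features come from a shared backbone with separate, L2-normalized projection heads (an instance branch and a group branch). All names here (spherical_kmeans, cld_loss, num_groups, temperature) are illustrative assumptions, not the authors' reference implementation; in particular, detaching the group-branch features is a simplification for this sketch.

```python
# A minimal sketch of cross-level discrimination (CLD), assuming PyTorch.
# Not the reference implementation: function names and the detach of the
# group branch are simplifying assumptions for illustration only.
import torch
import torch.nn.functional as F


def spherical_kmeans(x, k, iters=10):
    """Cluster L2-normalized features x of shape (N, D) into k local groups
    by cosine similarity (hard assignments, batch-local clustering)."""
    x = F.normalize(x, dim=1)
    centroids = x[torch.randperm(x.size(0))[:k]]      # random init from the batch
    assign = torch.zeros(x.size(0), dtype=torch.long, device=x.device)
    for _ in range(iters):
        sim = x @ centroids.t()                        # (N, k) cosine similarities
        assign = sim.argmax(dim=1)                     # hard group assignment
        for j in range(k):
            members = x[assign == j]
            if members.numel() > 0:                    # keep old centroid if group is empty
                centroids[j] = F.normalize(members.mean(dim=0), dim=0)
    return centroids, assign


def cld_loss(feat_inst_v1, feat_group_v2, num_groups=8, temperature=0.2):
    """Cross-level term for one direction: each instance feature from view 1 is
    attracted to the centroid of the local group containing its counterpart in
    view 2, and repelled from all other group centroids."""
    z = F.normalize(feat_inst_v1, dim=1)               # normalized projection-head output
    centroids, assign = spherical_kmeans(feat_group_v2.detach(), num_groups)
    logits = z @ centroids.t() / temperature           # (N, k) instance-vs-group logits
    return F.cross_entropy(logits, assign)             # target: the paired view's group
```

In the full method the term would be applied symmetrically across both views and combined with an instance-level contrastive loss from the base method (e.g., NPID or MoCo), since CLD is described above as an add-on rather than a standalone objective.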