In this paper, we propose a one-stage online clustering method called Contrastive Clustering (CC), which explicitly performs instance- and cluster-level contrastive learning. To be specific, for a given dataset, positive and negative instance pairs are constructed through data augmentations and then projected into a feature space. Therein, instance- and cluster-level contrastive learning are conducted in the row and column space, respectively, by maximizing the similarities of positive pairs while minimizing those of negative ones. Our key observation is that the rows of the feature matrix can be regarded as soft labels of instances, and accordingly the columns can be further regarded as cluster representations. By simultaneously optimizing the instance- and cluster-level contrastive losses, the model jointly learns representations and cluster assignments in an end-to-end manner. Extensive experimental results show that CC remarkably outperforms 17 competitive clustering methods on six challenging image benchmarks. In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19\% (39\%) performance improvement over the best baseline.
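To make the row/column duality concrete, the following is a minimal NumPy sketch, not the authors' implementation: a shared NT-Xent-style contrastive loss is applied once to the rows of the soft-assignment matrices from two augmented views (instance level) and once to their columns (cluster level). All names (`nt_xent`, the toy data, the temperature value) are illustrative assumptions; the actual CC model uses a deep backbone with separate instance and cluster projection heads.

```python
import numpy as np

def nt_xent(a, b, tau=0.5):
    """NT-Xent contrastive loss over paired vectors.

    a, b: (K, D) arrays where a[i] and b[i] form the positive pair;
    the remaining 2K - 2 vectors serve as negatives. Returns the mean loss.
    """
    z = np.concatenate([a, b], axis=0)                   # (2K, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # cosine similarity
    sim = z @ z.T / tau                                  # (2K, 2K) similarity matrix
    np.fill_diagonal(sim, -np.inf)                       # exclude self-pairs
    K = a.shape[0]
    # Index of each vector's positive counterpart in the other view.
    pos = np.concatenate([np.arange(K, 2 * K), np.arange(K)])
    logits = sim - sim.max(axis=1, keepdims=True)        # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * K), pos].mean()

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy soft-assignment matrices: a batch of 4 instances over 3 clusters,
# standing in for the network outputs on two augmentations of the batch.
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(4, 3))
z_a = softmax(logits_a)                                  # view 1
z_b = softmax(logits_a + 0.1 * rng.normal(size=(4, 3)))  # view 2 (perturbed)

loss_instance = nt_xent(z_a, z_b)        # rows: one soft label per instance
loss_cluster = nt_xent(z_a.T, z_b.T)     # columns: one representation per cluster
total = loss_instance + loss_cluster     # optimized jointly, end to end
```

Reusing the same contrastive objective on the transposed matrices is the essence of the dual formulation: the identical pull-together/push-apart mechanism yields discriminative instance features in row space and well-separated cluster assignments in column space.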