In this paper, we focus on unsupervised representation learning for image clustering. Recent advances in deep clustering and unsupervised representation learning are based on the idea that different views of an input image (generated through data augmentation techniques) must be close in the representation space (exemplar consistency), and/or similar images must have similar cluster assignments (population consistency). We define an additional notion of consistency, consensus consistency, which ensures that representations are learned to induce similar partitions for variations in the representation space, for different clustering algorithms, or for different initializations of a single clustering algorithm. We define a clustering loss by applying variations in the representation space and seamlessly integrate all three consistencies (consensus, exemplar, and population) into an end-to-end learning framework. The proposed algorithm, consensus clustering using unsupervised representation learning (ConCURL), improves upon the clustering performance of state-of-the-art methods on four out of five image datasets. Furthermore, we extend the evaluation procedure for clustering to reflect the challenges encountered in real-world clustering tasks, such as maintaining clustering performance under distribution shift. We also perform a detailed ablation study for a deeper understanding of the proposed algorithm. The code and the trained models are available at https://github.com/JayanthRR/ConCURL_NCE.
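The consensus-consistency idea can be illustrated with a minimal sketch: embeddings are passed through several random projections (one way to realize "variations in the representation space"), and the soft cluster assignments computed under each variation are penalized for disagreeing with the assignment in the original space. All function and variable names here are illustrative assumptions, not the authors' implementation; see the linked repository for the actual ConCURL code.

```python
# Hedged sketch of a consensus-consistency loss (not the authors' code).
# Soft cluster assignments are computed from cosine similarity to a set
# of prototypes; random linear projections act as variations of the
# representation space, and cross-entropy measures disagreement.
import numpy as np

rng = np.random.default_rng(0)

def soft_assign(z, prototypes, tau=0.1):
    """Softmax over cosine similarities between embeddings and prototypes."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ p.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def consensus_loss(embeddings, prototypes, n_views=4, proj_dim=32):
    """Average cross-entropy between the assignment in the original
    space and assignments under random projections of that space."""
    q = soft_assign(embeddings, prototypes)  # reference assignment
    loss = 0.0
    for _ in range(n_views):
        W = rng.normal(size=(embeddings.shape[1], proj_dim))
        q_v = soft_assign(embeddings @ W, prototypes @ W)
        loss += -np.mean(np.sum(q * np.log(q_v + 1e-9), axis=1))
    return loss / n_views

emb = rng.normal(size=(16, 64))      # toy batch of 16 embeddings
protos = rng.normal(size=(5, 64))    # 5 cluster prototypes
loss = consensus_loss(emb, protos)
```

Minimizing such a loss pushes the encoder toward representations whose induced partition is stable under perturbations of the space, which is the intuition behind consensus consistency; the full method additionally combines this with the exemplar and population consistency terms.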