In this paper, we focus on deep clustering and unsupervised representation learning for images. Recent advances in deep clustering and unsupervised representation learning are based on the idea that different views of an input image (generated through data augmentation techniques) should lie close together in the representation space (exemplar consistency), and/or that similar images should receive similar cluster assignments (population consistency). We define an additional notion of consistency, consensus consistency, which ensures that representations are learnt to induce similar partitions under variations in the representation space, different clustering algorithms, or different initializations of a clustering algorithm. We define a clustering loss by applying variations in the representation space and seamlessly integrate all three consistencies (consensus, exemplar, and population) into an end-to-end learning framework. The proposed algorithm, Consensus Clustering using Unsupervised Representation Learning (ConCURL), improves clustering performance over state-of-the-art methods on four out of five image datasets. Further, we extend the evaluation procedure for clustering to reflect the challenges of real-world clustering tasks, such as clustering performance under distribution shift. We also perform a detailed ablation study for a deeper understanding of the algorithm.
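To make the notion of consensus consistency concrete, the sketch below illustrates one possible way to penalize disagreement between cluster assignments computed in the original representation space and assignments computed under random transformations of that space. This is a minimal, hypothetical PyTorch illustration, not the exact ConCURL loss; the names `z` (normalized embeddings), `prototypes` (learnable cluster centers), and the choice of random linear projections are assumptions made for exposition.

```python
import torch
import torch.nn.functional as F


def consensus_loss(z, prototypes, num_projections=4, temperature=0.1):
    """Hypothetical consensus-consistency loss: soft cluster assignments
    computed under several random projections of the representation space
    are encouraged to agree with the assignment in the original space."""
    # Soft cluster assignment in the original representation space (treated as the target).
    logits = z @ prototypes.t() / temperature          # (batch, num_clusters)
    target = F.softmax(logits, dim=1).detach()

    d = z.shape[1]
    loss = 0.0
    for _ in range(num_projections):
        # Random variation of the representation space (a random linear map here).
        R = torch.randn(d, d, device=z.device) / d ** 0.5
        z_proj = F.normalize(z @ R, dim=1)
        p_proj = F.normalize(prototypes @ R, dim=1)
        logits_proj = z_proj @ p_proj.t() / temperature
        log_pred = F.log_softmax(logits_proj, dim=1)
        # Cross-entropy between the original-space and projected-space assignments.
        loss = loss - (target * log_pred).sum(dim=1).mean()
    return loss / num_projections


# Usage sketch: embeddings from an encoder and learnable prototypes.
encoder_dim, num_clusters, batch_size = 128, 10, 32
z = F.normalize(torch.randn(batch_size, encoder_dim), dim=1)       # stand-in for encoder output
prototypes = torch.nn.Parameter(torch.randn(num_clusters, encoder_dim))
print(consensus_loss(z, prototypes).item())
```

In an end-to-end setup, a loss of this form would be combined with exemplar- and population-consistency terms and backpropagated through the encoder and prototypes.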