Recent advances in deep clustering and unsupervised representation learning build on the idea that different views of an input image (generated through data augmentation) should either lie close in the representation space or share a similar cluster assignment. Bootstrap Your Own Latent (BYOL) is one such representation learning algorithm, which has achieved state-of-the-art results in self-supervised image classification on ImageNet under the linear evaluation protocol. However, the utility of the features learnt by BYOL for clustering has not been explored. In this work, we study the clustering ability of BYOL and observe that its learnt features may not be optimal for clustering. We propose a novel consensus-clustering-based loss function and train BYOL with this loss end to end, which improves its clustering ability and outperforms similar clustering-based methods on several popular computer vision datasets.
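The view-invariance idea that BYOL builds on can be sketched as a normalized regression objective: the online network's prediction of one augmented view is pulled toward the target network's (stop-gradient) projection of the other view. Below is a minimal illustrative sketch of that loss in NumPy; the function name and shapes are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def byol_view_loss(pred, target):
    """BYOL-style loss between the online prediction and the target
    projection of another view: 2 - 2 * cosine_similarity, so two
    representations pointing in the same direction give loss 0 and
    opposite directions give the maximum loss of 4."""
    p = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    z = target / np.linalg.norm(target, axis=-1, keepdims=True)
    return 2.0 - 2.0 * np.sum(p * z, axis=-1)

# Toy batch of one representation vector (dimension 3).
v = np.array([[1.0, 2.0, 3.0]])
print(byol_view_loss(v, v))    # identical views -> loss ~0
print(byol_view_loss(v, -v))   # opposite views  -> loss ~4
```

In the full method the total loss is symmetrized over the two views, and the target network is an exponential moving average of the online network rather than being updated by gradients.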