Self-supervised learning is a central component in recent approaches to deep multi-view clustering (MVC). However, we find large variations in the development of self-supervision-based methods for deep MVC, potentially slowing the progress of the field. To address this, we present DeepMVC, a unified framework for deep MVC that includes many recent methods as instances. We leverage our framework to make key observations about the effect of self-supervision, and in particular, drawbacks of aligning representations with contrastive learning. Further, we prove that contrastive alignment can negatively influence cluster separability, and that this effect becomes worse when the number of views increases. Motivated by our findings, we develop several new DeepMVC instances with new forms of self-supervision. We conduct extensive experiments and find that (i) in line with our theoretical findings, contrastive alignment decreases performance on datasets with many views; (ii) all methods benefit from some form of self-supervision; and (iii) our new instances outperform previous methods on several datasets. Based on our results, we suggest several promising directions for future research. To enhance the openness of the field, we provide an open-source implementation of DeepMVC, including recent models and our new instances. Our implementation includes a consistent evaluation protocol, facilitating fair and accurate evaluation of methods and components.