Incomplete multi-view clustering (IMVC) is an unsupervised learning task, and IMVC methods based on contrastive learning have attracted attention for their strong performance. Existing methods, however, have two problems: 1) They rely heavily on additional projection heads to counter dimensional collapse, in which the latent features span only a lower-dimensional subspace during clustering, yet many of the parameters in these projection heads are unnecessary. 2) Because consistency learning and reconstruction learning are performed on the same features, the recovered views contain inconsistent private information, and useless private information misleads the learning of common semantics. To address these issues, we propose a novel incomplete multi-view contrastive clustering framework. The framework directly optimizes the latent feature subspace, using the learned feature vectors and their sub-vectors for reconstruction learning and consistency learning, and thereby avoids dimensional collapse without relying on projection heads. Since the reconstruction loss and the contrastive loss are computed on different features, the adverse effect of useless private information is reduced. For incomplete data, missing information is recovered by a cross-view prediction mechanism, and inconsistent information from different views is discarded by minimizing conditional entropy, which further limits the influence of private information. Extensive experiments on five public datasets show that the method achieves state-of-the-art clustering results.
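The idea of computing the two losses on different features can be illustrated with a small sketch. This is one reading of the abstract, not the authors' implementation: it assumes the reconstruction loss uses the full latent vector, the contrastive loss uses only a leading sub-vector of length `sub_dim`, and the contrastive term is a standard InfoNCE loss with cosine similarity. The names `split_objective`, `info_nce`, `sub_dim`, `weight`, and the decoder callables are hypothetical, and the cross-view prediction and conditional-entropy terms are omitted.

```python
import math


def mse(a, b):
    """Mean squared error between two equally shaped batches of vectors."""
    count = sum(len(row) for row in a)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(za, zb, temperature=0.5):
    """InfoNCE: sample i of view A should match sample i of view B."""
    total = 0.0
    for i, u in enumerate(za):
        logits = [cosine(u, v) / temperature for v in zb]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(za)


def split_objective(z1, z2, x1, x2, decode1, decode2, sub_dim, weight=1.0):
    """Assumed split: full latents for reconstruction, sub-vectors for contrast."""
    # Reconstruction sees every latent dimension, private information included.
    rec = mse(decode1(z1), x1) + mse(decode2(z2), x2)
    # Consistency sees only the first `sub_dim` dimensions of each latent.
    s1 = [row[:sub_dim] for row in z1]
    s2 = [row[:sub_dim] for row in z2]
    con = info_nce(s1, s2)
    return rec + weight * con, rec, con


# Toy usage: two samples, two views, 3-dimensional latents, identity decoders.
z1 = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0]]
z2 = [[1.0, 0.0, -2.0], [0.0, 1.0, 7.0]]
identity = lambda z: z
total, rec, con = split_objective(z1, z2, z1, z2, identity, identity, sub_dim=2)
print(total, rec, con)
```

In this toy setup the third latent dimension plays the role of view-private information: it affects the reconstruction term but has no effect on the contrastive term, which is the separation the abstract describes.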