Exploring the complementary information in multi-view data to improve clustering performance is a crucial issue in multi-view clustering. In this paper, we propose a novel information-theoretic model, termed Informative Multi-View Clustering (IMVC), which extracts the common and view-specific information hidden in multi-view data and constructs a clustering-oriented comprehensive representation. More specifically, we concatenate the features of all views into a unified feature representation and pass it through an encoder to retrieve the common representation across views. Simultaneously, the features of each view are sent to an encoder to produce a compact view-specific representation for that view. We then constrain the mutual information between the common representation and the view-specific representations to be minimal in order to obtain multi-level information. Further, the common representation and each view-specific representation are concatenated to model the refined representation of that view, which is fed into a decoder that reconstructs the original data by maximizing the mutual information between the two. To form a comprehensive representation, the common representation and all view-specific representations are concatenated. Furthermore, to better adapt the comprehensive representation to the clustering task, we maximize the mutual information between an instance and its k-nearest neighbors to enhance intra-cluster aggregation, thereby inducing good overall separation between different clusters. Finally, we conduct extensive experiments on six benchmark datasets, and the experimental results indicate that the proposed IMVC outperforms other methods.
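The data flow described above can be illustrated with a minimal sketch. It only shows how the common, view-specific, refined, and comprehensive representations are assembled and how k-nearest neighbors would be looked up on the comprehensive representation. It is not the authors' implementation: the encoders are untrained random linear maps with a tanh, the use of one separate encoder per view is an assumption, the names `linear_encoder`, `build_representations`, and `knn_indices` and all dimensions are hypothetical, and none of the mutual-information objectives or the decoder are implemented.

```python
import math
import random


def linear_encoder(in_dim, out_dim, rng):
    """Hypothetical stand-in for a trained encoder: random linear map + tanh."""
    weights = [[rng.gauss(0, 1) / math.sqrt(in_dim) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return encode


def build_representations(views, common_dim=4, specific_dim=3, seed=0):
    """Assemble the representations named in the abstract.

    views: list of V views, each a list of n feature vectors (lists of floats).
    Dimensions and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(views[0])

    # Common representation: concatenate all views per instance, then encode.
    concat_dim = sum(len(view[0]) for view in views)
    common_enc = linear_encoder(concat_dim, common_dim, rng)
    common = [common_enc([x for view in views for x in view[i]]) for i in range(n)]

    # View-specific representations: one (assumed separate) encoder per view.
    specific = []
    for view in views:
        enc = linear_encoder(len(view[0]), specific_dim, rng)
        specific.append([enc(row) for row in view])

    # Refined representation of each view: common + that view's specific part.
    # In the paper this would be the decoder input; no decoder is built here.
    refined = [[common[i] + spec[i] for i in range(n)] for spec in specific]

    # Comprehensive representation: common + all view-specific parts.
    comprehensive = [common[i] + [x for spec in specific for x in spec[i]]
                     for i in range(n)]
    return common, specific, refined, comprehensive


def knn_indices(points, k):
    """Indices of the k nearest neighbors (Euclidean) of each point, excluding itself."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


# Toy data: 6 instances observed in two views with 5 and 8 features.
data_rng = random.Random(1)
toy_views = [[[data_rng.random() for _ in range(5)] for _ in range(6)],
             [[data_rng.random() for _ in range(8)] for _ in range(6)]]
common, specific, refined, comprehensive = build_representations(toy_views)
neighbors = knn_indices(comprehensive, k=2)
```

In a trained model, the neighbor pairs returned by `knn_indices` would be the instance pairs whose mutual information is maximized; here they only show which instances would be paired.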