Deep Multi-view Semi-supervised Clustering with Sample Pairwise Constraints

Multi-view clustering has attracted much attention thanks to its capacity for integrating multi-source information. Although numerous advanced methods have been proposed in past decades, most of them overlook the significance of weakly-supervised information and fail to preserve the feature properties of multiple views, resulting in unsatisfactory clustering performance. To address these issues, we propose a novel Deep Multi-view Semi-supervised Clustering (DMSC) method, which jointly optimizes three kinds of losses during network fine-tuning: a multi-view clustering loss, a semi-supervised pairwise constraint loss, and a multi-autoencoder reconstruction loss. Specifically, a KL-divergence-based multi-view clustering loss is imposed on the common representation of multi-view data to perform heterogeneous feature optimization, multi-view weighting, and clustering prediction simultaneously. Then, we propose to integrate pairwise constraints into the process of multi-view clustering by enforcing the learned multi-view representations of must-link samples to be similar and those of cannot-link samples to be dissimilar, so that the resulting clustering structure is more credible. Moreover, unlike existing rivals that preserve only the encoders of each heterogeneous branch during network fine-tuning, we further propose to tune the intact autoencoder framework, which contains both encoders and decoders. In this way, serious corruption of the view-specific and view-shared feature spaces can be alleviated, making the whole training procedure more stable. Through comprehensive experiments on eight popular image datasets, we demonstrate that our proposed approach performs better than state-of-the-art multi-view and single-view competitors.
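
The three jointly optimized losses can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the Student's t soft assignment and sharpened target distribution follow the common deep-embedded-clustering recipe, the pairwise term is a contrastive-style penalty with a hypothetical `margin`, and the trade-off weights `lam` and `gamma` are hypothetical names. It should be read as one plausible instance of such an objective, not as the authors' implementation.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Soft cluster assignment q_ij via a Student's t kernel (assumed form)."""
    # Squared distance between every embedding and every cluster center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p_ij: square q and normalize by soft cluster size."""
    w = q ** 2 / q.sum(axis=0, keepdims=True)
    return w / w.sum(axis=1, keepdims=True)

def kl_clustering_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def pairwise_loss(z, must_link, cannot_link, margin=1.0):
    """Pull must-link pairs together; push cannot-link pairs at least
    `margin` apart (contrastive-style penalty, assumed form)."""
    loss = 0.0
    for i, j in must_link:
        loss += np.sum((z[i] - z[j]) ** 2)
    for i, j in cannot_link:
        dist = np.sqrt(np.sum((z[i] - z[j]) ** 2))
        loss += max(0.0, margin - dist) ** 2
    return float(loss)

def reconstruction_loss(views, recons):
    """Sum over views of the mean squared error of each autoencoder."""
    return float(sum(np.mean((x - r) ** 2) for x, r in zip(views, recons)))

def total_loss(z, centers, views, recons, must_link, cannot_link,
               lam=1.0, gamma=1.0):
    """Weighted sum of the three terms; lam and gamma are hypothetical."""
    q = soft_assign(z, centers)
    p = target_distribution(q)
    return (kl_clustering_loss(p, q)
            + lam * pairwise_loss(z, must_link, cannot_link)
            + gamma * reconstruction_loss(views, recons))
```

Here `z` stands for the common representation fused from the view-specific encoders, `views` for the raw inputs of each view, and `recons` for the corresponding decoder outputs. Keeping the reconstruction term in the objective is what corresponds to tuning the intact autoencoders rather than the encoders alone.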
