Existing deep embedding clustering methods only consider the deepest layer when learning a feature embedding and thus fail to fully exploit the discriminative information available from cluster assignments, resulting in limited performance. To this end, we propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC). Specifically, DAGC first utilizes a heterogeneity-wise fusion module to adaptively integrate the features of an auto-encoder and a graph convolutional network at each layer, and then uses a scale-wise fusion module to dynamically concatenate the multi-scale features from different layers. These modules learn a discriminative feature embedding via an attention-based mechanism. In addition, we design a distribution-wise fusion module that leverages cluster assignments to obtain clustering results directly. To better exploit the discriminative information in the cluster assignments, we develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo-supervision loss. Extensive experiments on six benchmark datasets validate that our method consistently outperforms state-of-the-art methods. In particular, our method improves the ARI by more than 18.14% over the best baseline.
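The heterogeneity-wise fusion described above can be sketched as a per-node attention over the two feature sources. This is a minimal illustration, not the paper's implementation: the function name `attention_fuse` and the learnable projection `w` are hypothetical, and in practice `w` would be trained jointly with the network.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(z_ae, z_gcn, w):
    """Sketch of heterogeneity-wise fusion: adaptively combine
    auto-encoder features z_ae and GCN features z_gcn (both (n, d))
    with per-node attention weights from a projection w of shape (2*d, 2)."""
    # Score both sources for each node, then normalize to attention weights.
    scores = np.concatenate([z_ae, z_gcn], axis=1) @ w  # (n, 2)
    alpha = softmax(scores, axis=1)                     # rows sum to 1
    # Attention-weighted combination of the two feature sources.
    return alpha[:, :1] * z_ae + alpha[:, 1:] * z_gcn
```

A scale-wise fusion module can apply the same idea across layers, weighting each layer's fused features before concatenation.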
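The soft self-supervision strategy builds on the standard soft-assignment machinery from deep embedding clustering: a Student's t soft assignment Q, a sharpened target distribution P, and a KL divergence between them. The sketch below shows that single-term building block only; the paper's triplet loss applies such KL terms across multiple assignment distributions, and the function names here are illustrative.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft cluster assignments Q (DEC-style).
    z: (n, d) embeddings; centers: (k, d) cluster centers."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(1, keepdims=True)

def target_distribution(q):
    """Sharpened target P that emphasizes high-confidence assignments."""
    p = q ** 2 / q.sum(0)
    return p / p.sum(1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """KL(P || Q): one term of a soft self-supervision loss."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Hard self-supervision would instead turn high-confidence rows of Q into pseudo-labels and train against them with a supervised loss.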