Attributed graph clustering, which learns node representations from node attributes and graph topology for clustering, is a fundamental but challenging task in graph analysis. Recently, methods based on graph contrastive learning (GCL) have achieved impressive clustering performance on this task. However, we observe that existing GCL-based methods 1) fail to benefit from imprecise clustering labels; 2) require a post-processing step to obtain clustering labels; and 3) cannot handle the out-of-sample (OOS) problem. To address these issues, we propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC). In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss, which aims to maximize the similarities of intra-cluster nodes while minimizing the similarities of inter-cluster nodes, is designed for node representation learning. Meanwhile, a clustering module is built to directly output clustering labels by contrasting the representations of different clusters. Thus, for OOS nodes, SCAGC can directly compute their clustering labels. Extensive experimental results on four benchmark datasets show that SCAGC consistently outperforms 11 competitive clustering methods.
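The self-supervised contrastive loss described above can be illustrated with a minimal sketch. The snippet below is an assumption about the general form of such a loss (a supervised-contrastive-style objective over pseudo cluster labels), not the paper's exact formulation: for each anchor node, nodes sharing its (possibly inaccurate) cluster label are treated as positives and all other nodes as negatives.

```python
import numpy as np

def self_supervised_contrastive_loss(z, labels, tau=0.5):
    """Hypothetical sketch of a contrastive loss over pseudo cluster labels:
    pulls intra-cluster node embeddings together and pushes inter-cluster
    embeddings apart. `z` is an (n, d) embedding matrix, `labels` an (n,)
    array of clustering labels, `tau` a temperature."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = np.exp(z @ z.T / tau)
    n = len(labels)
    loss, count = 0.0, 0
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)
        if not pos.any():
            continue  # singleton cluster: no positive pairs for this anchor
        denom = sim[i].sum() - sim[i, i]  # all pairs except the anchor itself
        loss += -np.log(sim[i, pos] / denom).mean()
        count += 1
    return loss / max(count, 1)
```

Under this sketch, embeddings that agree with the pseudo labels (tight, well-separated clusters) yield a lower loss than embeddings whose labels are shuffled, which is the signal that lets the network benefit from imprecise clustering labels.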