Graphs are powerful representations of relations among objects and have attracted considerable attention. A fundamental challenge in graph learning is how to train an effective Graph Neural Network (GNN) encoder without labels, which are expensive and time-consuming to obtain. Contrastive Learning (CL) is one of the most popular paradigms for addressing this challenge: it trains GNNs by discriminating between positive and negative node pairs. Despite the success of recent CL methods, two problems remain under-explored. The first is how to reduce the semantic error introduced by random topology-based data augmentations. Traditional CL defines positive and negative node pairs via node-level topological proximity, which depends solely on the graph topology and ignores the semantic information in node attributes; as a result, some semantically similar nodes can be wrongly treated as negative pairs. The second is how to effectively model the multiplexity of real-world graphs, where nodes are connected by various relations and each relation forms a homogeneous graph layer. To solve these problems, we propose a novel multiplex heterogeneous graph prototypical contrastive learning (X-GOAL) framework to extract node embeddings. X-GOAL comprises two components: the GOAL framework, which learns node embeddings for each homogeneous graph layer, and an alignment regularization, which jointly models different layers by aligning layer-specific node embeddings. Specifically, the GOAL framework captures node-level information through a succinct graph transformation technique, and captures cluster-level information by pulling nodes within the same semantic cluster closer together in the embedding space. The alignment regularization aligns embeddings across layers at both the node and cluster levels. We evaluate X-GOAL on various real-world datasets and downstream tasks to demonstrate its effectiveness.
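To make the cluster-level and alignment ideas concrete, the following is a minimal NumPy sketch, not the paper's actual formulation. The abstract does not give the loss functions, so everything here is an assumption for illustration: the function names (`prototype_contrastive_loss`, `node_alignment_loss`), the use of cluster means as prototypes, the softmax-over-prototypes loss, the temperature value, and the squared-distance alignment term. Cluster assignments are assumed to be given (for example, by running k-means on the embeddings).

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (rows of zeros stay zero)."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def prototype_contrastive_loss(embeddings, assignments, num_clusters,
                               temperature=0.5):
    """Illustrative cluster-level loss (assumed form, not from the paper).

    Each node is pulled toward the prototype (mean embedding) of its own
    cluster and pushed away from the other prototypes via a softmax
    cross-entropy over node-to-prototype similarities.
    """
    z = l2_normalize(embeddings)

    # Prototype k is the mean of the normalized embeddings assigned to k.
    prototypes = np.zeros((num_clusters, z.shape[1]))
    for k in range(num_clusters):
        members = z[assignments == k]
        if len(members) > 0:
            prototypes[k] = members.mean(axis=0)
    prototypes = l2_normalize(prototypes)

    # Cosine similarity between nodes and prototypes, scaled by temperature.
    logits = z @ prototypes.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Negative log-probability of each node's own cluster, averaged.
    return float(-log_probs[np.arange(len(z)), assignments].mean())


def node_alignment_loss(layer_embeddings):
    """Illustrative node-level alignment term (assumed form).

    Penalizes the squared distance between each layer's normalized node
    embeddings and the average over layers, so that the same node has
    similar embeddings in every layer.
    """
    stacked = np.stack([l2_normalize(z) for z in layer_embeddings])
    center = stacked.mean(axis=0)
    return float(((stacked - center) ** 2).sum(axis=2).mean())
```

In this sketch, a layer's loss is low when nodes sit close to their own cluster prototype, and the alignment term is zero when all layers produce identical normalized embeddings. How the actual framework combines such terms, and how it forms clusters, is not specified in the abstract.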