In this paper, we provide a theory of using graph neural networks (GNNs) for \textit{multi-node representation learning}, where the goal is to learn a representation for a set of more than one node, such as a link. Existing GNNs are mainly designed to learn single-node representations. When a representation of a node set involving multiple nodes is needed, a common practice in previous works is to directly aggregate the single-node representations obtained by a GNN. In this paper, we show a fundamental limitation of such an approach, namely its inability to capture the dependence among the nodes in a node set, and argue that directly aggregating individual node representations fails to produce an effective joint representation for multiple nodes. A straightforward remedy is to distinguish the target nodes from the rest. Formalizing this idea, we propose the \textit{labeling trick}, which first labels nodes in the graph according to their relationships with the target node set, then applies a GNN to the labeled graph, and finally aggregates the resulting node representations into a multi-node representation. The labeling trick also unifies several previous successful approaches to multi-node representation learning, including SEAL, Distance Encoding, ID-GNN, and NBFNet. Besides node sets in graphs, we further extend the labeling trick to posets, subsets, and hypergraphs. Experiments verify that the labeling trick can boost GNNs on various tasks, including undirected link prediction, directed link prediction, hyperedge prediction, and subgraph prediction. Our work explains the superior performance of previous node-labeling-based methods and establishes a theoretical foundation for using GNNs for multi-node representation learning.
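The pipeline described above (label the nodes, run a GNN on the labeled graph, then aggregate the target nodes' representations) can be illustrated with a minimal sketch. The sketch below uses the simplest labeling variant, a zero-one indicator of membership in the target set, together with a toy two-layer mean-aggregation GNN in NumPy; the function names, the layer design, and the sum pooling are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def zero_one_label(num_nodes, target_set):
    # Simplest labeling trick: label[i] = 1 if node i is in the
    # target node set, else 0 (a zero-one membership indicator).
    lab = np.zeros((num_nodes, 1))
    lab[list(target_set)] = 1.0
    return lab

def gnn_layer(A, H, W):
    # One toy message-passing layer: mean-aggregate neighbor
    # features (D^{-1} A H), apply a linear map W, then ReLU.
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    return np.maximum(((A @ H) / deg) @ W, 0.0)

def multi_node_repr(A, X, target_set, W1, W2):
    # Labeling trick: concatenate the 0/1 labels to the raw node
    # features *before* running the GNN, so message passing can
    # propagate each node's relationship to the target set.
    H = np.concatenate([X, zero_one_label(len(X), target_set)], axis=1)
    H = gnn_layer(A, H, W1)
    H = gnn_layer(A, H, W2)
    # Aggregate the target nodes' representations (sum pooling here)
    # to obtain a joint representation of the whole node set.
    return H[list(target_set)].sum(axis=0)
```

Because the labels enter before message passing, two target sets that plain aggregation of unlabeled embeddings could not distinguish (e.g., node pairs at different distances in a graph with identical node features) generally receive different joint representations.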