Graph Transformers have demonstrated superiority on various graph learning tasks in recent years. However, the complexity of existing Graph Transformers scales quadratically with the number of nodes, making it hard to scale even to graphs with thousands of nodes. To this end, we propose a Neighborhood Aggregation Graph Transformer (NAGphormer) that is scalable to large graphs with millions of nodes. Before feeding the node features into the Transformer model, NAGphormer constructs tokens for each node via a neighborhood aggregation module called Hop2Token. For each node, Hop2Token aggregates the neighborhood features of each hop into a single representation, thereby producing a sequence of token vectors. The resulting sequence of multi-hop representations then serves as input to the Transformer model. By treating each node as a sequence of tokens, NAGphormer can be trained in a mini-batch manner and thus scales to large graphs. NAGphormer further introduces an attention-based readout function to learn the importance of each hop adaptively. We conduct extensive experiments on various popular benchmarks, including six small datasets and three large datasets. The results demonstrate that NAGphormer consistently outperforms existing Graph Transformers and mainstream Graph Neural Networks.
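To make the token-construction step concrete, below is a minimal sketch of the Hop2Token idea, assuming each hop's neighborhood is summarized by repeated multiplication of the node features with a symmetrically normalized adjacency matrix; the function name, normalization choice, and shapes are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of Hop2Token (illustrative assumption, not the authors' code):
# token 0 is a node's own features; token k aggregates its k-hop neighborhood.
import numpy as np

def hop2token(adj: np.ndarray, features: np.ndarray, num_hops: int) -> np.ndarray:
    """Return a (num_nodes, num_hops + 1, feat_dim) tensor of hop tokens.

    adj      : (n, n) unweighted adjacency matrix without self-loops.
    features : (n, d) node feature matrix X.
    """
    n = adj.shape[0]
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}
    a_hat = adj + np.eye(n)
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt

    tokens = [features]                    # hop 0: raw node features
    propagated = features
    for _ in range(num_hops):
        propagated = a_norm @ propagated   # aggregate one more hop
        tokens.append(propagated)
    # Stack so each node owns a sequence of (num_hops + 1) token vectors,
    # which can then be fed to a standard Transformer encoder in mini-batches.
    return np.stack(tokens, axis=1)

# Example: 4-node path graph, 3-dimensional features, 2 hops.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.rand(4, 3)
seq = hop2token(A, X, num_hops=2)
print(seq.shape)  # (4, 3, 3): 4 nodes, 3 tokens per node, feature dim 3
```

Because each node's token sequence is precomputed and fixed-length, nodes can be batched independently of the graph size, which is what enables the mini-batch training described in the abstract.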