We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNN). When trained on a large-scale graph dataset (PCQM4Mv2), our method coined Tokenized Graph Transformer (TokenGT) achieves significantly better results compared to GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.
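As a rough illustration of the tokenization described above, the sketch below treats every node and every edge as an independent token, augments each token with node-identifier and type embeddings, and feeds the sequence to an off-the-shelf Transformer encoder. This is a minimal sketch assuming PyTorch; all names (`TokenGTSketch`, `node_id`, `feat_dim`, `id_dim`) are hypothetical and it is not the released implementation, which is available at the repository linked above.

```python
import torch
import torch.nn as nn


class TokenGTSketch(nn.Module):
    """Minimal sketch: nodes and edges as independent tokens for a plain Transformer."""

    def __init__(self, feat_dim, id_dim, hidden_dim=64, num_layers=2, num_heads=4):
        super().__init__()
        # Project [token feature | two node identifiers | type embedding] to hidden_dim.
        self.proj = nn.Linear(feat_dim + 2 * id_dim + 2, hidden_dim)
        layer = nn.TransformerEncoderLayer(hidden_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, node_feat, edge_feat, edge_index, node_id):
        # node_feat: [n, feat_dim]   edge_feat: [m, feat_dim]
        # edge_index: [2, m] (endpoint indices)   node_id: [n, id_dim]
        n, m = node_feat.size(0), edge_feat.size(0)
        type_node = node_feat.new_tensor([1.0, 0.0]).expand(n, 2)
        type_edge = edge_feat.new_tensor([0.0, 1.0]).expand(m, 2)
        # A node token repeats its own identifier; an edge token carries the
        # identifiers of its two endpoints, so attention can relate them.
        node_tok = torch.cat([node_feat, node_id, node_id, type_node], dim=-1)
        edge_tok = torch.cat(
            [edge_feat, node_id[edge_index[0]], node_id[edge_index[1]], type_edge], dim=-1
        )
        tokens = torch.cat([node_tok, edge_tok], dim=0).unsqueeze(0)  # [1, n + m, ...]
        return self.encoder(self.proj(tokens))
```

The key design choice sketched here is that an edge token shares node identifiers with the tokens of its two endpoint nodes, so a standard self-attention layer can recover the incidence structure of the graph without any graph-specific attention bias.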