Pre-training Graph Neural Networks (GNNs) via self-supervised contrastive learning has recently drawn considerable attention. However, most existing works focus on node-level contrastive learning, which cannot capture global graph structure. The key challenge in conducting subgraph-level contrastive learning is to sample informative subgraphs that are semantically meaningful. To address this, we propose to learn graph motifs, which are frequently occurring subgraph patterns (e.g., functional groups of molecules), for better subgraph sampling. Our framework, MotIf-driven Contrastive leaRning Of Graph representations (MICRO-Graph), can: 1) use GNNs to extract motifs from large graph datasets; 2) leverage the learned motifs to sample informative subgraphs for contrastive learning of GNNs. We formulate motif learning as a differentiable clustering problem, and adopt EM clustering to group similar and significant subgraphs into several motifs. Guided by these learned motifs, a sampler is trained to generate more informative subgraphs, and these subgraphs are used to train GNNs through graph-to-subgraph contrastive learning. When pre-trained on the ogbg-molhiv dataset with MICRO-Graph, the GNN achieves an average ROC-AUC improvement of 2.04% across various downstream benchmark datasets, which is significantly higher than that of other state-of-the-art self-supervised learning baselines.
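The two ingredients named in the abstract, soft assignment of subgraph embeddings to motifs and graph-to-subgraph contrastive learning, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`soft_assign`, `graph_to_subgraph_loss`), the cosine similarity, the InfoNCE-style loss form, and the temperature value are all assumptions made for illustration, and the embeddings are plain Python lists rather than GNN outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_assign(subgraph_emb, motif_embs, temperature=0.2):
    """Softmax over similarities to each motif vector.

    Assumption: this plays the role of an E-step-like soft cluster
    assignment; the temperature of 0.2 is a hypothetical choice.
    """
    sims = [cosine(subgraph_emb, m) / temperature for m in motif_embs]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def graph_to_subgraph_loss(graph_embs, subgraph_embs, temperature=0.2):
    """InfoNCE-style loss (assumed form, not taken from the source).

    Subgraph i is treated as the positive for graph i; every other
    graph in the batch serves as a negative.
    """
    n = len(graph_embs)
    loss = 0.0
    for i in range(n):
        logits = [
            cosine(subgraph_embs[i], graph_embs[j]) / temperature
            for j in range(n)
        ]
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(x - peak) for x in logits)
        )
        loss += log_denominator - logits[i]
    return loss / n


# Toy batch: two graphs with orthogonal embeddings.
graphs = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each subgraph matches its own graph
misaligned = [[0.0, 1.0], [1.0, 0.0]]   # each subgraph matches the other graph

aligned_loss = graph_to_subgraph_loss(graphs, aligned)
misaligned_loss = graph_to_subgraph_loss(graphs, misaligned)
assignment = soft_assign([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy batch the loss is small when each subgraph embedding lies close to its own graph's embedding and large when the pairs are swapped, which is the signal that contrastive pre-training optimizes. In a real setup the embeddings would come from a GNN encoder and the motif vectors would be learned jointly.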