Knowledge distillation is a learning paradigm for boosting resource-efficient graph neural networks (GNNs) using more expressive yet cumbersome teacher models. Past work on distillation for GNNs proposed the Local Structure Preserving loss (LSP), which matches local structural relationships defined over edges across the student and teacher's node embeddings. This paper studies whether preserving the global topology of how the teacher embeds graph data can be a more effective distillation objective for GNNs, as real-world graphs often contain latent interactions and noisy edges. We propose Graph Contrastive Representation Distillation (G-CRD), which uses contrastive learning to implicitly preserve global topology by aligning the student node embeddings to those of the teacher in a shared representation space. Additionally, we introduce an expanded set of benchmarks on large-scale real-world datasets where the performance gap between teacher and student GNNs is non-negligible. Experiments across 4 datasets and 14 heterogeneous GNN architectures show that G-CRD consistently boosts the performance and robustness of lightweight GNNs, outperforming LSP (and a global structure preserving variant of LSP) as well as baselines from 2D computer vision. An analysis of the representational similarity among teacher and student embedding spaces reveals that G-CRD balances preserving local and global relationships, while structure preserving approaches are best at preserving one or the other. Our code is available at https://github.com/chaitjo/efficient-gnns
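To make the contrastive objective described above concrete, the following is a minimal sketch of an InfoNCE-style distillation loss that pulls each node's student embedding toward the same node's teacher embedding in a shared space, while pushing it away from the teacher embeddings of other nodes. It is an illustration under stated assumptions, not the authors' implementation (which is in the linked repository): the projection matrices `w_student` and `w_teacher`, the `temperature` value, and the function names are hypothetical, and NumPy is used in place of a deep learning framework to keep the example self-contained.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def contrastive_distillation_loss(student, teacher, w_student, w_teacher,
                                  temperature=0.1):
    """InfoNCE-style loss over a batch of n nodes (illustrative sketch).

    student:   (n, d_s) student node embeddings
    teacher:   (n, d_t) teacher node embeddings, treated as fixed targets
    w_student: (d_s, d) hypothetical projection into the shared space
    w_teacher: (d_t, d) hypothetical projection into the shared space
    """
    z_s = l2_normalize(student @ w_student)
    z_t = l2_normalize(teacher @ w_teacher)

    # (n, n) matrix of cosine similarities, sharpened by the temperature.
    logits = (z_s @ z_t.T) / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs (same node in student and teacher) lie on the diagonal.
    return float(-np.mean(np.diag(log_probs)))
```

Because every node is contrasted against all other nodes in the batch rather than only its graph neighbours, a loss of this form depends on relationships beyond the edge set, which is the sense in which the abstract describes the objective as implicitly preserving global topology.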