Deep graph neural networks (GNNs) have been shown to be expressive for modeling graph-structured data. Nevertheless, the over-stacked architecture of deep graph models makes them difficult to deploy and rapidly test on mobile or embedded systems. To compress over-stacked GNNs, knowledge distillation via a teacher-student architecture turns out to be an effective technique, where the key step is to measure the discrepancy between the teacher and student networks with predefined distance functions. However, using the same distance for graphs of various structures may be ill-suited, and the optimal distance formulation is hard to determine. To tackle these problems, we propose a novel Adversarial Knowledge Distillation framework for graph models, named GraphAKD, which adversarially trains a discriminator and a generator to adaptively detect and decrease the discrepancy. Specifically, noticing that well-captured inter-node and inter-class correlations favor the success of deep GNNs, we propose to criticize the inherited knowledge from node-level and class-level views with a trainable discriminator. The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator. To the best of our knowledge, GraphAKD is the first work to introduce adversarial training to knowledge distillation in graph domains. Experiments on node-level and graph-level classification benchmarks demonstrate that GraphAKD improves the student performance by a large margin. The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
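For illustration, the adversarial interplay between the student (acting as generator) and the discriminator can be sketched as a GAN-style training step. The following is a minimal sketch under assumed interfaces: `student`, `teacher`, `graph`, `feats`, `labels`, and `train_mask` are placeholders, and a simple binary cross-entropy objective over output logits stands in for the paper's node-level and class-level criticism; it is not the exact GraphAKD formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Scores whether an output vector comes from the teacher (real) or the student (fake)."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)  # one unnormalized logit per node


def adversarial_kd_step(student, teacher, disc, graph, feats, labels, train_mask,
                        opt_student, opt_disc):
    """One adversarial distillation step (hypothetical GNN interfaces, for illustration only)."""
    with torch.no_grad():
        logits_t = teacher(graph, feats)      # frozen teacher outputs ("real" knowledge)
    logits_s = student(graph, feats)          # student outputs (the "generator")

    # 1) Update the discriminator: tell teacher outputs apart from student outputs.
    opt_disc.zero_grad()
    d_real = disc(logits_t)
    d_fake = disc(logits_s.detach())
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    loss_d.backward()
    opt_disc.step()

    # 2) Update the student: fool the discriminator while fitting the task labels.
    opt_student.zero_grad()
    d_fake = disc(logits_s)
    loss_adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_task = F.cross_entropy(logits_s[train_mask], labels[train_mask])
    (loss_task + loss_adv).backward()
    opt_student.step()

    return loss_d.item(), loss_adv.item(), loss_task.item()
```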