Existing knowledge distillation methods focus on convolutional neural networks (CNNs), where the input samples like images lie in a grid domain, and have largely overlooked graph convolutional networks (GCNs) that handle non-grid data. In this paper, we propose, to the best of our knowledge, the first dedicated approach to distilling knowledge from a pre-trained GCN model. To enable the knowledge transfer from the teacher GCN to the student, we propose a local structure preserving module that explicitly accounts for the topological semantics of the teacher. In this module, the local structure information from both the teacher and the student is extracted as distributions, and hence minimizing the distance between these distributions enables topology-aware knowledge transfer from the teacher, yielding a compact yet high-performance student model. Moreover, the proposed approach is readily extendable to dynamic graph models, where the input graphs for the teacher and the student may differ. We evaluate the proposed method on two different datasets using GCN models of different architectures, and demonstrate that our method achieves state-of-the-art knowledge distillation performance for GCN models. Code is publicly available at https://github.com/ihollywhy/DistillGCN.PyTorch.
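To make the local structure preserving idea concrete, the sketch below shows one way it could be implemented in PyTorch: each node's similarities to its neighbors are normalized into a softmax distribution, and the student is trained to match the teacher's distributions via a KL-divergence loss. This is a minimal illustration, not the authors' released code; the function names (`local_structure_distribution`, `lsp_loss`) and the squared-Euclidean similarity kernel are assumptions made for the example, and the official repository should be consulted for the actual implementation.

```python
import torch


def local_structure_distribution(features, edge_index):
    """Per-node softmax distribution over similarities to its neighbors.

    features:   (N, d) node embeddings from one GCN layer.
    edge_index: (2, E) directed edges (source -> destination).
    Returns a length-E tensor holding, for every edge, the probability that
    the source node assigns to that neighbor.
    """
    src, dst = edge_index
    # Negative squared Euclidean distance as the similarity kernel
    # (an RBF or linear kernel could be substituted here).
    sim = -((features[src] - features[dst]) ** 2).sum(dim=-1)
    sim = sim - sim.max()  # numerical stability before exponentiation
    exp_sim = sim.exp()
    # Normalize separately over each source node's outgoing edges.
    denom = torch.zeros(features.size(0), device=features.device)
    denom.index_add_(0, src, exp_sim)
    return exp_sim / (denom[src] + 1e-12)


def lsp_loss(student_feat, teacher_feat, edge_index):
    """KL divergence between teacher and student local-structure
    distributions, averaged over edges."""
    p_s = local_structure_distribution(student_feat, edge_index)
    p_t = local_structure_distribution(teacher_feat, edge_index)
    kl_per_edge = p_t * (torch.log(p_t + 1e-12) - torch.log(p_s + 1e-12))
    return kl_per_edge.sum() / edge_index.size(1)


if __name__ == "__main__":
    # Toy usage: 4 nodes, a small directed edge list, random embeddings.
    # Teacher and student feature dimensions may differ, since similarities
    # are computed within each model's own embedding space.
    edge_index = torch.tensor([[0, 0, 1, 2, 3], [1, 2, 0, 3, 2]])
    teacher = torch.randn(4, 64)  # hidden features of a larger teacher GCN
    student = torch.randn(4, 16)  # compact student
    print(lsp_loss(student, teacher, edge_index).item())
```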