Given the prevalence of large-scale graphs in real-world applications, the storage and time costs of training neural models have raised increasing concerns. To alleviate these concerns, we propose and study the problem of graph condensation for graph neural networks (GNNs). Specifically, we aim to condense the large, original graph into a small, synthetic and highly-informative graph, such that GNNs trained on the small graph and the large graph have comparable performance. We approach the condensation problem by imitating the GNN training trajectory on the original graph through the optimization of a gradient matching loss, and we design a strategy to condense node features and structural information simultaneously. Extensive experiments have demonstrated the effectiveness of the proposed framework in condensing different graph datasets into informative smaller graphs. In particular, we are able to approximate the original test accuracy by 95.3% on Reddit, 99.8% on Flickr and 99.0% on Citeseer, while reducing their graph size by more than 99.9%, and the condensed graphs can be used to train various GNN architectures. Code is released at https://github.com/ChandlerBang/GCond.
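The core idea of matching gradients between real and synthetic data can be illustrated with a minimal sketch. The snippet below uses a plain linear softmax classifier with hand-derived gradients and ignores graph structure entirely; the actual GCond framework matches gradients of GNNs and additionally learns the synthetic adjacency. All function names here are hypothetical and not from the released code.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with a max-shift for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def ce_grad(W, X, y):
    """Gradient of mean cross-entropy of a linear classifier w.r.t. W."""
    P = softmax(X @ W)                   # (n, c) class probabilities
    P[np.arange(len(y)), y] -= 1.0       # dL/dlogits for true labels
    return X.T @ P / len(y)              # (d, c) gradient

def gradient_matching_loss(W, real, synth):
    """1 - cosine similarity between gradients on real vs. synthetic data.

    Minimizing this w.r.t. the synthetic data pushes its training
    gradients to point in the same direction as the real ones.
    """
    g_r = ce_grad(W, *real).ravel()
    g_s = ce_grad(W, *synth).ravel()
    cos = g_r @ g_s / (np.linalg.norm(g_r) * np.linalg.norm(g_s) + 1e-12)
    return 1.0 - cos
```

In the full method this loss would be evaluated along a training trajectory (i.e., at many parameter states `W`) and the synthetic features and structure would be updated by gradient descent on it.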