Given the prevalence of large-scale graphs in real-world applications, the storage and training time of neural models have raised increasing concerns. To alleviate these concerns, we propose and study the problem of graph condensation for graph neural networks (GNNs). Specifically, we aim to condense the large, original graph into a small, synthetic, and highly informative graph, such that GNNs trained on the small graph and the large graph have comparable performance. We approach the condensation problem by imitating the GNN training trajectory on the original graph through the optimization of a gradient matching loss, and we design a strategy to condense node features and structural information simultaneously. Extensive experiments have demonstrated the effectiveness of the proposed framework in condensing different graph datasets into informative smaller graphs. In particular, we are able to approximate the original test accuracy by 95.3% on Reddit, 99.8% on Flickr, and 99.0% on Citeseer, while reducing their graph size by more than 99.9%, and the condensed graphs can be used to train various GNN architectures.
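To make the gradient-matching idea concrete, the sketch below condenses a toy dataset with a linear model standing in for the GNN. Everything here is a hypothetical simplification of the abstract's setup, not the paper's actual pipeline: the synthetic features are optimized so that the model's gradient on the small synthetic set matches its gradient on the full set, evaluated at a few sampled parameter vectors that crudely stand in for points along a training trajectory. The update uses finite differences for self-containedness; in practice an autograd library would compute this analytically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: condense 200 real samples into 10 synthetic ones.
n, m, p = 200, 10, 5
w_true = rng.normal(size=p)
X_real = rng.normal(size=(n, p))
y_real = X_real @ w_true
X_syn = rng.normal(size=(m, p))       # synthetic features, to be optimized
y_syn = X_syn @ w_true                # synthetic labels, held fixed
X_syn0 = X_syn.copy()                 # keep the initialization for comparison

def model_grad(X, y, w):
    """Gradient of the MSE loss ||Xw - y||^2 / (2 len(y)) w.r.t. w."""
    return X.T @ (X @ w - y) / len(y)

# A few fixed parameter samples stand in for points along the training
# trajectory at which the real and synthetic gradients should agree.
W = [rng.normal(size=p) for _ in range(5)]

def match_total(X_s):
    """Gradient-matching loss summed over the sampled parameters."""
    return sum(
        np.sum((model_grad(X_real, y_real, w) - model_grad(X_s, y_syn, w)) ** 2)
        for w in W
    )

# Minimize the matching loss over the synthetic features by plain
# finite-difference gradient descent.
eps, lr = 1e-5, 0.005
for _ in range(200):
    base = match_total(X_syn)
    grad = np.zeros_like(X_syn)
    for i in range(m):
        for j in range(p):
            X_pert = X_syn.copy()
            X_pert[i, j] += eps
            grad[i, j] = (match_total(X_pert) - base) / eps
    X_syn -= lr * grad

print("matching loss:", match_total(X_syn), "vs initial:", match_total(X_syn0))
```

After optimization, a model trained on the 10 synthetic samples sees gradients much closer to those produced by the 200 real samples, which is the mechanism by which the condensed graph can substitute for the original during training. The paper additionally condenses graph structure jointly with features, which this scalar sketch omits.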