Due to the significant computational challenge of training large-scale graph neural networks (GNNs), various sparse learning techniques have been exploited to reduce memory and storage costs. Examples include \textit{graph sparsification}, which samples a subgraph to reduce the amount of data aggregation, and \textit{model sparsification}, which prunes the neural network to reduce the number of trainable weights. Despite their empirical success in reducing the training cost while maintaining the test accuracy, the theoretical generalization analysis of sparse learning for GNNs remains elusive. To the best of our knowledge, this paper provides the first theoretical characterization of joint edge-model sparse learning from the perspective of sample complexity and convergence rate in achieving zero generalization error. It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy. Although the analysis is centered on two-layer GNNs with structural constraints on the data, the insights are applicable to more general setups and are justified by experiments on both synthetic data and real-world citation datasets.
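To make the two sparsification steps concrete, the following is a minimal NumPy sketch, not the paper's algorithm or its guarantees: degree-based node sampling stands in here for importance-based graph sparsification, and row-wise magnitude pruning of a hidden-layer weight matrix stands in for pruning the lowest-magnitude neurons. All function names and parameters (\texttt{sample\_important\_nodes}, \texttt{magnitude\_prune}, \texttt{keep\_ratio}) are illustrative assumptions, not identifiers from the paper.

\begin{verbatim}
# Illustrative sketch of joint edge-model sparsification (not the paper's method).
import numpy as np

def sample_important_nodes(adj, keep_ratio=0.5):
    """Graph sparsification: keep the highest-degree nodes (a simple proxy
    for node importance) and return the subgraph they induce."""
    degrees = adj.sum(axis=1)
    n_keep = max(1, int(keep_ratio * adj.shape[0]))
    keep = np.argsort(-degrees)[:n_keep]          # indices of sampled nodes
    return adj[np.ix_(keep, keep)], keep

def magnitude_prune(weights, keep_ratio=0.5):
    """Model sparsification: zero out the neurons (rows of the hidden-layer
    weight matrix) with the smallest L2 magnitude."""
    norms = np.linalg.norm(weights, axis=1)
    n_keep = max(1, int(keep_ratio * weights.shape[0]))
    mask = np.zeros(norms.shape, dtype=bool)
    mask[np.argsort(-norms)[:n_keep]] = True
    return weights * mask[:, None]

# Toy usage: a random symmetric graph and a random hidden-layer weight matrix.
rng = np.random.default_rng(0)
adj = (rng.random((8, 8)) < 0.3).astype(float)
adj = np.maximum(adj, adj.T)                      # symmetrize adjacency
W = rng.normal(size=(16, 4))                      # hidden neurons x input dim
sub_adj, kept_nodes = sample_important_nodes(adj, keep_ratio=0.5)
W_sparse = magnitude_prune(W, keep_ratio=0.5)
print(sub_adj.shape, kept_nodes, np.count_nonzero(W_sparse.any(axis=1)))
\end{verbatim}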