Current graph neural networks (GNNs) are known to be difficult to make deep because of the over-smoothing problem. Multi-scale GNNs are a promising approach for mitigating over-smoothing. However, learning theory offers little explanation of why they work empirically. In this study, we derive optimization and generalization guarantees for transductive learning algorithms that include multi-scale GNNs. Using boosting theory, we prove the convergence of the training error under weak-learning-type conditions. By combining this result with generalization gap bounds in terms of transductive Rademacher complexity, we derive a test error bound for a specific type of multi-scale GNN that, under some conditions, decreases as the number of node aggregations increases. Our results offer theoretical explanations for the effectiveness of the multi-scale structure against the over-smoothing problem. We apply boosting algorithms to the training of multi-scale GNNs for real-world node prediction tasks. We confirm that their performance is comparable to that of existing GNNs and that their practical behavior is consistent with the theoretical observations. Code is available at https://github.com/delta2323/GB-GNN.
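To make the idea of training a multi-scale model by boosting more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a squared loss, ridge-regularized linear weak learners, and a symmetrically normalized adjacency matrix; the function names `normalized_adjacency` and `boost_multiscale`, the hyperparameters, and the toy graph are all hypothetical. At round t the weak learner sees features aggregated t times, is fitted to the current residual on the training nodes only (the transductive setting), and is added to the ensemble.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A."""
    a = adj + np.eye(adj.shape[0])           # add self-loops so every degree is > 0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def boost_multiscale(adj, x, y, train_idx, rounds=5, step=0.5, ridge=1e-3):
    """Gradient boosting over aggregation scales (illustrative assumption).

    Round t fits a ridge-regression weak learner on A_hat^t X to the residual
    of the squared loss on the training nodes, then adds it to the prediction.
    """
    a_hat = normalized_adjacency(adj)
    pred = np.zeros(x.shape[0])
    h = x.copy()                              # features at the current scale
    losses = []
    for _ in range(rounds):
        # Negative gradient of 0.5 * squared loss, on training nodes only.
        residual = y[train_idx] - pred[train_idx]
        h_tr = h[train_idx]
        gram = h_tr.T @ h_tr + ridge * np.eye(h.shape[1])
        w = np.linalg.solve(gram, h_tr.T @ residual)
        pred = pred + step * (h @ w)          # update all nodes, labeled or not
        losses.append(float(np.mean((y[train_idx] - pred[train_idx]) ** 2)))
        h = a_hat @ h                         # one more node aggregation
    return pred, losses


# Toy data (hypothetical): two 4-node cliques joined by a single edge.
adj = np.zeros((8, 8))
adj[:4, :4] = 1.0
adj[4:, 4:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[3, 4] = adj[4, 3] = 1.0

rng = np.random.default_rng(0)
y = np.array([1.0] * 4 + [-1.0] * 4)          # one label per community
x = y[:, None] + rng.normal(scale=1.0, size=(8, 3))
train_idx = np.array([0, 1, 4, 5])            # labels are used only on these nodes

pred, losses = boost_multiscale(adj, x, y, train_idx)
```

With a step size in (0, 1] and a ridge fit, each round can only shrink the training residual, so `losses` is non-increasing in this sketch. That property belongs to this simplified setup; it is not a restatement of the paper's convergence result, which is proved under weak-learning-type conditions.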