It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as \textit{over-smoothing}. Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem. However, there is little explanation, from the viewpoint of learning theory, of why they work empirically. In this study, we derive optimization and generalization guarantees for transductive learning algorithms that include multi-scale GNNs. Using boosting theory, we prove the convergence of the training error under weak-learning-type conditions. By combining this with generalization gap bounds in terms of the transductive Rademacher complexity, we show that the test error bound of a specific type of multi-scale GNN decreases with depth under these conditions. Our results offer a theoretical explanation for the effectiveness of the multi-scale structure against the over-smoothing problem. We apply boosting algorithms to the training of multi-scale GNNs on real-world node prediction tasks. We confirm that their performance is comparable to that of existing GNNs and that their practical behavior is consistent with our theoretical observations. Code is available at https://github.com/delta2323/GB-GNN.
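To make the idea of combining boosting with a multi-scale GNN concrete, the following is a minimal sketch, not the authors' GB-GNN implementation (which is available at the repository above). It assumes binary labels in $\{-1, +1\}$, a symmetrically normalized adjacency matrix, ridge-regularized linear weak learners, and squared loss; the function names, the \texttt{ridge} and \texttt{lr} parameters, and the toy graph are hypothetical choices for illustration only.

\begin{verbatim}
# Minimal sketch (not the authors' GB-GNN code): gradient-boosting-style
# training of a multi-scale GNN for transductive node classification.
# Assumptions: binary labels in {-1, +1}, squared loss, and linear weak
# learners fitted on the t-th aggregated node features A^t X.
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize an adjacency matrix with self-loops."""
    adj = adj + np.eye(adj.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(adj.sum(axis=1)))
    return d_inv_sqrt @ adj @ d_inv_sqrt

def fit_weak_learner(features, residual, train_idx, ridge=1e-2):
    """Fit a ridge-regularized linear learner to the current residual."""
    ftr = features[train_idx]
    return np.linalg.solve(ftr.T @ ftr + ridge * np.eye(ftr.shape[1]),
                           ftr.T @ residual[train_idx])

def boosted_multiscale_gnn(adj, x, y, train_idx, depth=10, lr=0.5):
    """Boost weak learners built on increasingly aggregated node features.

    At iteration t the features are A^t X (the t-th "scale"); each weak
    learner fits the negative gradient of the squared loss on the training
    nodes, and the ensemble sums the weighted predictions of all scales
    (a multi-scale skip structure).
    """
    a_norm = normalize_adjacency(adj)
    features = x.copy()
    scores = np.zeros(x.shape[0])       # ensemble prediction for every node
    for _ in range(depth):
        features = a_norm @ features    # one more step of neighborhood aggregation
        residual = y - scores           # negative gradient of 1/2 * squared loss
        w = fit_weak_learner(features, residual, train_idx)
        scores += lr * (features @ w)   # add the new weak learner's contribution
    return np.sign(scores)              # transductive predictions for all nodes

# Toy usage: a 4-node path graph with one feature per node; nodes 0 and 3
# are the labeled training nodes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([[1.0], [0.8], [-0.7], [-1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
print(boosted_multiscale_gnn(adj, x, y, train_idx=np.array([0, 3])))
\end{verbatim}

In this sketch, deeper boosting rounds correspond to larger aggregation scales, so the ensemble retains contributions from shallow scales even as the depth grows, which is the mechanism the abstract credits for mitigating over-smoothing.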