In recent years, several results in the supervised learning setting have suggested that classical statistical learning-theoretic measures, such as VC dimension, do not adequately explain the performance of deep learning models, which has prompted a slew of work in the infinite-width and iteration regimes. However, there is little theoretical explanation for the success of neural networks beyond the supervised setting. In this paper, we argue that, under some distributional assumptions, classical learning-theoretic measures can sufficiently explain generalisation for graph neural networks in the transductive setting. In particular, we provide a rigorous analysis of the performance of neural networks in the context of transductive inference, specifically by analysing the generalisation properties of graph convolutional networks for the problem of node classification. While VC dimension yields only trivial generalisation error bounds in this setting as well, we show that transductive Rademacher complexity can explain the generalisation properties of graph convolutional networks for stochastic block models. We further use the generalisation error bounds based on transductive Rademacher complexity to demonstrate the role of graph convolutions and network architectures in achieving smaller generalisation error, and provide insights into when the graph structure can help in learning. The findings of this paper could renew interest in studying generalisation in neural networks in terms of learning-theoretic measures, albeit in specific problems.