Graph neural networks are among the most successful machine learning models for relational datasets such as metabolic, transportation, and social networks. Yet the determinants of their strong generalization across the diverse interactions encoded in the data are not well understood. Methods from statistical learning theory do not explain emergent phenomena such as double descent or the dependence of risk on the nature of the interactions. We use analytical tools from statistical physics and random matrix theory to precisely characterize generalization in simple graph convolution networks on the contextual stochastic block model. The derived curves are phenomenologically rich: they explain the distinction between learning on homophilic and heterophilic graphs, and they predict double descent, whose existence in GNNs has been questioned by recent work. We show how risk depends on the interplay between the noise in the graph, the noise in the features, and the proportion of nodes used for training. Our analysis predicts qualitative behavior not only of a stylized graph learning model but also of complex GNNs on messy real-world datasets. As a case in point, we use these analytic insights about heterophily and self-loop signs to improve the performance of state-of-the-art graph convolution networks on several heterophilic benchmarks by the simple addition of negative self-loop filters.
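To make the negative self-loop idea mentioned above concrete, the following minimal NumPy sketch contrasts a standard graph-convolution step with one whose self-loops carry a negative sign. The function name, the symmetric renormalization, and the toy data are assumptions chosen for illustration, not the paper's exact architecture.

```python
import numpy as np

def gcn_layer(A, X, W, self_loop_sign=+1.0):
    """One linear graph-convolution step: X' = D^{-1/2} (A + s*I) D^{-1/2} X W.

    self_loop_sign = +1 is the usual setting suited to homophilic graphs;
    self_loop_sign = -1 is the negative self-loop filter suggested for
    heterophilic graphs. The normalization details are illustrative assumptions.
    """
    n = A.shape[0]
    A_hat = A + self_loop_sign * np.eye(n)            # add signed self-loops
    deg = np.abs(A_hat).sum(axis=1)                   # degrees of the signed graph
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]  # symmetric normalization
    return A_norm @ X @ W                             # propagate and transform

# Toy usage: 4-node graph, 3-dim features, 2-dim output.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = np.random.randn(4, 3)
W = np.random.randn(3, 2)
H_standard = gcn_layer(A, X, W, self_loop_sign=+1.0)   # standard positive self-loops
H_negative = gcn_layer(A, X, W, self_loop_sign=-1.0)   # negative self-loop filter
```

The only change between the two calls is the sign of the self-loop term, which is the kind of one-line modification the abstract refers to for heterophilic benchmarks.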