Supervised learning on graphs is challenging because the data are high-dimensional and carry inherent structural dependencies: each edge depends on a pair of vertices. Conventional methods designed for Euclidean data do not account for this dependency structure. To address this issue, this paper proposes an iterative vertex screening method that identifies the signal subgraph most informative for the given graph attributes. The method screens the rows and columns of the adjacency matrix concurrently and stops when the resulting distance correlation is maximized. We establish the theoretical foundation of the method by proving that it estimates the true signal subgraph with high probability. Additionally, we establish the convergence rate of the classification error under the Erdős-Rényi random graph model and prove that the subsequent classification can be asymptotically optimal, outperforming classification based on the entire graph under high-dimensional conditions. The method is evaluated on various simulated datasets and on real-world human and murine graphs derived from functional and structural magnetic resonance images. The results show that it performs well in estimating the ground-truth signal subgraph and achieves superior classification accuracy.
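The screening idea described above can be illustrated with a minimal sketch. It is not the paper's exact algorithm: the abstract does not say how vertices are ranked or how many are dropped per round, so the per-vertex score (distance correlation between a vertex's row in the current subgraph and the labels), the `drop_frac` schedule, and the function names `distance_correlation` and `iterative_vertex_screening` are all assumptions made for illustration. The sketch uses the standard biased sample distance correlation and keeps the vertex set whose induced subgraph attains the highest value.

```python
import numpy as np


def distance_correlation(x, y):
    """Biased sample distance correlation between paired samples x and y."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Pairwise Euclidean distance matrices over the samples.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2)
    # Double-center each distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0  # one side is constant, so there is no dependence to measure
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def iterative_vertex_screening(graphs, labels, drop_frac=0.5):
    """Illustrative screening loop (assumed details, see the lead-in).

    graphs: array of shape (m, n, n), one adjacency matrix per sample.
    labels: array of shape (m,), the graph attribute to predict.
    Returns the kept vertex indices and the distance correlation they attain.
    """
    graphs = np.asarray(graphs, dtype=float)
    m, n, _ = graphs.shape
    kept = np.arange(n)
    best_score, best_kept = -np.inf, kept
    while True:
        # Restrict rows and columns to the same vertex set at once.
        sub = graphs[:, kept][:, :, kept]
        score = distance_correlation(sub.reshape(m, -1), labels)
        if score > best_score:
            best_score, best_kept = score, kept
        if len(kept) <= 1:
            break
        # Assumed ranking rule: score each vertex by its row in the subgraph.
        vertex_scores = np.array(
            [distance_correlation(sub[:, i, :], labels) for i in range(len(kept))]
        )
        n_keep = int(np.ceil(len(kept) * (1.0 - drop_frac)))
        n_keep = min(max(n_keep, 1), len(kept) - 1)  # always drop at least one
        top = np.argsort(vertex_scores)[::-1][:n_keep]
        kept = np.sort(kept[top])
    return best_kept, best_score
```

Because the full graph is scored in the first round, the returned value is never lower than the distance correlation of the entire graph. A smaller `drop_frac` gives a finer search over subgraph sizes at a higher computational cost.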