Graph Neural Network (GNN) research is rapidly growing thanks to the capacity of GNNs to learn representations from graph-structured data. However, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns, regulatory restrictions, and commercial competition. Federated learning (FL), a trending distributed learning paradigm, aims to solve this challenge while preserving privacy. Despite recent advances in the vision and language domains, there is no suitable platform for the federated training of GNNs. To this end, we introduce FedGraphNN, an open research federated learning system and benchmark to facilitate GNN-based FL research. FedGraphNN is built on a unified formulation of federated GNNs and supports commonly used datasets, GNN models, FL algorithms, and flexible APIs. We also contribute a new molecular dataset, hERG, to promote research exploration. Our experimental results reveal significant challenges in federated GNN training: federated GNNs perform worse than centralized GNNs on most datasets under a non-I.I.D. split, and the GNN model that attains the best result in the centralized setting may not hold its advantage in the federated setting. These results imply that more research effort is needed to unravel the mystery behind federated GNN training. Moreover, our system performance analysis demonstrates that the FedGraphNN system is computationally affordable for most research labs with limited GPUs. We maintain the source code at https://github.com/FedML-AI/FedGraphNN.
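To make the federated training setup concrete, the sketch below shows the weighted model-averaging (FedAvg-style) aggregation step that underlies most federated GNN training. This is a minimal illustration, not the FedGraphNN API; the function and variable names are hypothetical, and model parameters are represented as plain dictionaries of floats rather than real GNN weight tensors.

```python
# Minimal sketch of a FedAvg-style aggregation step for federated GNN
# training. Hypothetical names; not the FedGraphNN API.

def fedavg_aggregate(client_params, client_sizes):
    """Average client model parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    keys = client_params[0].keys()
    return {
        k: sum(p[k] * n for p, n in zip(client_params, client_sizes)) / total
        for k in keys
    }

# Two simulated clients holding (flattened) local GNN weights.
clients = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
sizes = [100, 300]  # non-I.I.D.: client 2 holds 3x more local graphs

global_params = fedavg_aggregate(clients, sizes)
print(global_params)  # {'w': 2.5, 'b': 1.5}
```

In a real run, each client would first train its local GNN on its private graphs, then only the parameters (never the raw graph data) would be sent to the server for this aggregation, which is what preserves privacy.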