Graph data are ubiquitous in the real world. Graph learning (GL) aims to mine and analyze graph data so that valuable information can be discovered. Existing GL methods are designed for centralized scenarios. In practice, however, graph data are usually distributed across different organizations, i.e., the curse of isolated data islands. To address this problem, we incorporate federated learning into GL and propose FedGL, a general federated graph learning framework capable of obtaining a high-quality global graph model while protecting data privacy by discovering global self-supervision information during federated training. Concretely, each client uploads its prediction results and node embeddings to the server, which discovers global pseudo labels and a global pseudo graph; these are distributed back to each client to enrich the training labels and complement the graph structure, respectively, thereby improving the quality of each local model. Moreover, the global self-supervision enables the information of each client to flow and be shared in a privacy-preserving manner, alleviating the heterogeneity and exploiting the complementarity of graph data across clients. Finally, experimental results show that FedGL significantly outperforms baselines on four widely used graph datasets.
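To make the server-side aggregation concrete, the sketch below illustrates one plausible realization of the two global self-supervision signals described above: pseudo labels taken from confident averaged client predictions, and a pseudo graph built by k-nearest-neighbor search over averaged node embeddings. The function name, confidence threshold, and kNN construction are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def server_aggregate(client_preds, client_embs, conf_thresh=0.9, k=2):
    """Hypothetical sketch of the server step: derive global pseudo labels
    from confident averaged predictions, and a global pseudo graph by
    connecting each node to its k nearest neighbors in averaged embedding
    space. Threshold and k are illustrative choices."""
    # Average the soft predictions and embeddings uploaded by clients.
    preds = np.mean(client_preds, axis=0)   # shape: (n_nodes, n_classes)
    embs = np.mean(client_embs, axis=0)     # shape: (n_nodes, dim)

    # Global pseudo labels: keep only high-confidence predictions.
    conf = preds.max(axis=1)
    pseudo_labels = {i: int(preds[i].argmax())
                     for i in range(len(preds)) if conf[i] >= conf_thresh}

    # Global pseudo graph: k-nearest-neighbor edges in embedding space.
    dists = np.linalg.norm(embs[:, None, :] - embs[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude self-loops
    pseudo_edges = {(i, int(j)) for i in range(len(embs))
                    for j in np.argsort(dists[i])[:k]}
    return pseudo_labels, pseudo_edges
```

Each client would then merge the returned pseudo labels into its local training set and add the pseudo edges to its local adjacency structure before the next round of local training.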