Graphs are widely used to represent relations among entities. When one party owns the complete data, the entire graph can be built easily, and analysis on it is straightforward. In many scenarios, however, centralizing the data is impractical because of privacy concerns: each organization or party keeps only part of the whole graph, i.e., the graph data is isolated across parties. Federated Learning (FL) has recently been proposed to address this data isolation issue, mainly for Euclidean data. Applying FL to graph data remains a challenge because graphs contain topological information, which is notoriously non-IID and hard to partition. In this work, we propose FedCog, a novel FL framework for graph data that efficiently handles coupled graphs, a kind of distributed graph data that exists widely in real-world applications such as mobile carriers' communication networks and banks' transaction networks. We theoretically prove the correctness and security of FedCog. Experimental results demonstrate that FedCog significantly outperforms traditional FL methods on graphs, improving the accuracy of node classification tasks by up to 14.7%.