Graph learning, which aims to infer the underlying topology behind high-dimensional data, has attracted intense attention. In this study, we shed new light on graph learning by considering a pragmatic scenario in which data are privacy-sensitive and located in separate clients (devices or organizations). The main difficulty in learning graphs in this scenario is that we cannot process all the data on a central server, because the data are not allowed to leave the local clients due to privacy concerns. The problem becomes more challenging when the data of different clients are non-IID, since a single global graph cannot reasonably describe heterogeneous data. To address these issues, we propose a novel framework in which a personalized graph for each client and a consensus graph are jointly learned in a federated fashion. Specifically, in the proposed federated algorithm, we communicate model updates instead of raw data to the central server. A convergence analysis shows that the algorithm enjoys an $\mathcal{O}(1/T)$ convergence rate. To further enhance privacy, we design a differentially private algorithm that prevents information about the raw data from being leaked when model updates are transferred. We provide theoretical guidance on how to ensure that the algorithm satisfies differential privacy, and we analyze the impact of differential privacy on the convergence of our algorithm. Finally, extensive experiments on both synthetic and real-world data are carried out to validate the proposed models and algorithms. Experimental results illustrate that our framework is able to learn graphs effectively in the target scenario.
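To make the idea of privately transferring model updates concrete, the following is a minimal sketch and not the algorithm of this work. The abstract does not name a specific privacy mechanism, so the sketch assumes a common one: each client clips its update to a fixed L2 norm and adds Gaussian noise before sending it, and the server averages what it receives. The names `clip_update`, `privatize_update`, `aggregate`, `clip_norm`, and `noise_std` are hypothetical, and calibrating `noise_std` to a target privacy budget is deliberately left out.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale `update` so that its L2 norm is at most `clip_norm`."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def privatize_update(update, clip_norm, noise_std, rng):
    """Clip a client's update, then add zero-mean Gaussian noise.

    Assumption: Gaussian noise on a clipped update. How `noise_std`
    must be chosen to meet a given privacy budget is not shown here.
    """
    clipped = clip_update(update, clip_norm)
    return [x + rng.gauss(0.0, noise_std) for x in clipped]


def aggregate(updates):
    """Server side: coordinate-wise average of the received updates."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Toy run: three clients, each holding a 2-dimensional model update.
rng = random.Random(0)
client_updates = [[3.0, 4.0], [0.3, -0.4], [-6.0, 8.0]]
noisy_updates = [
    privatize_update(u, clip_norm=1.0, noise_std=0.1, rng=rng)
    for u in client_updates
]
consensus_step = aggregate(noisy_updates)
```

Only `noisy_updates` ever leave the clients in this sketch; the raw data and the unclipped updates stay local. A larger `noise_std` hides more about each client but perturbs the averaged step more, which is the kind of privacy and convergence trade-off the abstract refers to.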