While federated learning (FL) enables distributed agents to collaboratively train a centralized model without sharing data with each other, it fails to protect users against inference attacks that mine private information from the centralized model. Thus, equipping federated learning with differential privacy (DPFL) becomes attractive. Existing algorithms based on privately aggregating clipped gradients require many rounds of communication, may fail to converge, and cannot scale up to large-capacity models due to the explicit dimension dependence of their added noise. In this paper, we adopt the knowledge transfer model of private learning pioneered by Papernot et al. (2017; 2018) and extend their algorithm PATE, as well as the recent alternative PrivateKNN (Zhu et al., 2020), to the federated learning setting. The key difference is that our method privately aggregates the labels from the agents in a voting scheme, instead of aggregating the gradients, hence avoiding the dimension dependence and achieving significant savings in communication cost. Theoretically, we show that when the margins of the voting scores are large, the agents enjoy exponentially higher accuracy and stronger (data-dependent) differential privacy guarantees at both the agent level and the instance level. Extensive experiments show that our approach significantly improves the privacy-utility trade-off over the current state-of-the-art in DPFL.
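The private voting scheme described above can be sketched as follows. This is a minimal illustration of PATE-style noisy-max aggregation, not the paper's exact mechanism: each agent submits a predicted label for a query, the server tallies the votes, perturbs the histogram with Laplace noise, and releases only the noisy argmax. The function name `private_vote` and the `noise_scale` parameter are illustrative choices, not names from the paper.

```python
import numpy as np

def private_vote(agent_labels, num_classes, noise_scale=1.0, rng=None):
    """Aggregate per-agent label predictions for one query via noisy-max voting.

    agent_labels: integer array of each agent's predicted label.
    noise_scale:  Laplace scale b; larger b means stronger privacy, lower accuracy.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Tally the vote histogram over classes.
    votes = np.bincount(agent_labels, minlength=num_classes).astype(float)
    # Perturb each count with independent Laplace noise, then release the argmax.
    noisy_votes = votes + rng.laplace(scale=noise_scale, size=num_classes)
    return int(np.argmax(noisy_votes))

# When the voting margin is large relative to the noise (the regime the
# theoretical results address), the noisy winner matches the true majority
# with overwhelming probability.
labels = np.array([2] * 9 + [0])  # margin of 8 votes for class 2
prediction = private_vote(labels, num_classes=3, noise_scale=0.1,
                          rng=np.random.default_rng(0))
```

Note that only the single winning label leaves the server per query, independent of the model's parameter count, which is why this scheme sidesteps the dimension dependence that gradient aggregation incurs.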