Providing privacy protection has been one of the primary motivations of Federated Learning (FL). Recently, there has been a line of work on incorporating the formal privacy notion of differential privacy into FL. To guarantee client-level differential privacy in FL algorithms, the clients' transmitted model updates must be clipped before privacy noise is added. This clipping operation is substantially different from its counterpart, gradient clipping, in centralized differentially private SGD and is not yet well understood. In this paper, we first empirically demonstrate that clipped FedAvg can perform surprisingly well even under substantial data heterogeneity when training neural networks, partly because the clients' updates become similar for several popular deep architectures. Based on this key observation, we provide a convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between the clipping bias and the distribution of the clients' updates. To the best of our knowledge, this is the first work that rigorously investigates theoretical and empirical issues regarding the clipping operation in FL algorithms.
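To make the clipping mechanism discussed above concrete, the following is a minimal NumPy sketch of one round of update clipping and noising for client-level DP in a FedAvg-style aggregation. The function names (`clip_update`, `dp_fedavg_round`) and the exact noise calibration are illustrative assumptions, not the paper's implementation; in practice the noise scale would be set by a DP accountant.

```python
import numpy as np

def clip_update(update, clip_bound):
    """Rescale a client's model update so its L2 norm is at most clip_bound."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_bound / (norm + 1e-12))

def dp_fedavg_round(client_updates, clip_bound, noise_multiplier, rng=None):
    """Average clipped client updates and add Gaussian noise for client-level DP."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = [clip_update(u, clip_bound) for u in client_updates]
    aggregate = np.mean(clipped, axis=0)
    # The noise std scales with the clipping bound (the per-client sensitivity)
    # and shrinks with the number of participating clients.
    noise_std = noise_multiplier * clip_bound / len(client_updates)
    return aggregate + rng.normal(0.0, noise_std, size=aggregate.shape)
```

Note that, unlike per-example gradient clipping in centralized DP-SGD, the clipped quantity here is an entire local model update accumulated over multiple local steps, which is the source of the clipping bias analyzed in the paper.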