Although Federated Learning (FL) is well known for preserving privacy when training machine learning models collaboratively among distributed clients, recent studies have shown that naive FL is susceptible to gradient leakage attacks. Meanwhile, Differential Privacy (DP) has emerged as a promising countermeasure against such attacks. However, the adoption of DP by clients in FL can significantly degrade model accuracy, and understanding the practicality of DP from a theoretical perspective remains an open problem. In this paper, we make the first attempt to understand the practicality of DP in FL by tuning the number of training iterations. Based on the FedAvg algorithm, we formally derive the convergence rate of FL with DP noise. We then theoretically derive: 1) the conditions under which DP-based FedAvg converges as the number of global iterations (GI) approaches infinity; 2) a method for setting the number of local iterations (LI) to minimize the negative influence of DP noise. By further substituting the Laplace and Gaussian mechanisms into the derived convergence rate, we show that: 3) DP-based FedAvg with the Laplace mechanism cannot converge, but its divergence rate can be effectively curbed by setting the number of LIs with our method; 4) the learning error of DP-based FedAvg with the Gaussian mechanism converges to a constant if a fixed number of LIs is used per GI. To verify our theoretical findings, we conduct extensive experiments on two real-world datasets. The results not only validate our analysis but also provide useful guidelines on how to optimize model accuracy when incorporating DP into FL.
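To make the setting concrete, the following is a minimal sketch of one global iteration (GI) of DP-based FedAvg with the Gaussian mechanism, in the style the abstract describes: each client runs a number of local iterations (LI), clips its model update, and perturbs it with Gaussian noise before the server averages. All names and constants here (local_update, CLIP_NORM, SIGMA, NUM_LI) are hypothetical and for illustration only; the paper's exact noise calibration and choice of LIs follow from its derived convergence rate, not from this sketch.

```python
# Illustrative DP-based FedAvg round (Gaussian mechanism); names are assumptions.
import copy
import itertools
import torch

CLIP_NORM = 1.0   # per-client L2 clipping bound on the model update (assumed)
SIGMA = 0.5       # Gaussian noise multiplier (assumed)
NUM_LI = 5        # number of local iterations per GI -- the knob the paper tunes

def local_update(model, data_loader, num_li, lr=0.01):
    """Run num_li local SGD steps; return the clipped, noised model update."""
    local_model = copy.deepcopy(model)
    opt = torch.optim.SGD(local_model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    batches = itertools.cycle(data_loader)  # cycle in case num_li > loader length
    for _ in range(num_li):
        x, y = next(batches)
        opt.zero_grad()
        loss_fn(local_model(x), y).backward()
        opt.step()
    # Form the model delta, clip its global L2 norm, then add Gaussian noise
    # locally so the server never observes the raw (leakage-prone) update.
    delta = [p_new.data - p_old.data
             for p_new, p_old in zip(local_model.parameters(), model.parameters())]
    total_norm = float(torch.sqrt(sum((d ** 2).sum() for d in delta)))
    scale = min(1.0, CLIP_NORM / (total_norm + 1e-12))
    return [d * scale + torch.randn_like(d) * SIGMA * CLIP_NORM for d in delta]

def fedavg_round(global_model, client_loaders):
    """Average the noisy client updates and apply them to the global model."""
    updates = [local_update(global_model, dl, NUM_LI) for dl in client_loaders]
    with torch.no_grad():
        for i, p in enumerate(global_model.parameters()):
            p.add_(torch.stack([u[i] for u in updates]).mean(dim=0))
    return global_model
```

In this sketch, the accuracy-privacy trade-off the abstract analyzes shows up directly: larger SIGMA strengthens privacy but inflates the noise term in the averaged update, while NUM_LI controls how much progress each client makes per unit of injected noise, which is why the paper tunes the number of LIs per GI.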