Federated learning (FL) enables distributed clients to collaboratively train a shared machine learning model by exchanging parameter information. Although FL protects clients' raw data, malicious users can still reconstruct the original data from the disclosed parameters. To remedy this flaw, differential privacy (DP) is incorporated into FL clients to perturb the original parameters, which, however, can significantly impair the accuracy of the trained model. In this work, we study a crucial question that has been largely overlooked by existing works: what are the optimal numbers of queries and replies in FL with DP such that the final model accuracy is maximized? In FL, the parameter server (PS) queries participating clients over multiple global iterations to complete training, and each client responds to a query from the PS by conducting a local iteration. Our work investigates how many times the PS should query the clients and how many times each client should reply to the PS. We study the two most extensively used DP mechanisms, i.e., the Laplace and Gaussian mechanisms. Through convergence rate analysis, we determine the optimal numbers of queries and replies in FL with DP so that the final model accuracy is maximized. Finally, extensive experiments are conducted on the publicly available MNIST and FEMNIST datasets to verify our analysis; the results demonstrate that properly setting the numbers of queries and replies can significantly improve the final model accuracy in FL with DP.
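To make the query/reply structure concrete, below is a minimal sketch (our own illustration under simplifying assumptions, not the paper's algorithm): each PS query triggers a fixed number of local iterations per client, and each client's reply is perturbed with Laplace or Gaussian noise calibrated in the standard way before disclosure. The helper `fl_round`, the least-squares objective, and all hyperparameters here are hypothetical stand-ins.

```python
import numpy as np

def fl_round(global_params, client_datasets, num_local_iters, noise_fn, rng, lr=0.1):
    """One global iteration (one PS query): each client runs num_local_iters
    local iterations, perturbs its update with DP noise before disclosure,
    and the PS averages the noisy replies."""
    replies = []
    for X, y in client_datasets:
        w = global_params.copy()
        for _ in range(num_local_iters):
            grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5*||Xw - y||^2 / n
            w -= lr * grad
        update = w - global_params
        replies.append(update + noise_fn(update.shape, rng))  # DP perturbation
    return global_params + np.mean(replies, axis=0)

rng = np.random.default_rng(0)
d = 5
clients = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(4)]

# Gaussian mechanism: sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
eps, delta, sensitivity = 1.0, 1e-5, 1.0
sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
gaussian = lambda shape, r: r.normal(0.0, sigma, size=shape)

# Laplace mechanism alternative: noise scale = sensitivity / epsilon.
laplace = lambda shape, r: r.laplace(0.0, sensitivity / eps, size=shape)

w = np.zeros(d)
for _ in range(50):  # number of PS queries (global iterations)
    w = fl_round(w, clients, num_local_iters=3, noise_fn=gaussian, rng=rng)
```

In this sketch the trade-off studied in the paper is visible directly: more queries and replies reduce optimization error but inject noise more often, so the counts 50 and 3 above are exactly the quantities whose optimal values the convergence analysis determines.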