Federated learning (FL) is a technique that trains machine learning models from decentralized data sources. We study FL under local privacy constraints, which provide strong protection against sensitive data disclosure by obfuscating the data before it leaves the client. We identify two major concerns in designing practical privacy-preserving FL algorithms: communication efficiency and high-dimensional compatibility. We then develop a gradient-based learning algorithm called \emph{sqSGD} (selective quantized stochastic gradient descent) that addresses both concerns. The proposed algorithm is based on a novel privacy-preserving quantization scheme that uses a constant number of bits per dimension per client. We then improve the base algorithm in three ways: first, we apply a gradient subsampling strategy that simultaneously offers better training performance and smaller communication costs under a fixed privacy budget; second, we utilize randomized rotation as a preprocessing step to reduce quantization error; third, we adopt an adaptive gradient-norm bound shrinkage strategy to improve accuracy and stabilize training. Finally, we demonstrate the practicality of the proposed framework on benchmark datasets. Experimental results show that sqSGD successfully learns large models like LeNet and ResNet under local privacy constraints. In addition, at fixed privacy and communication levels, sqSGD significantly outperforms various baseline algorithms.
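To make the quantization idea above concrete, the following is a minimal illustrative sketch (in Python/NumPy) of one plausible per-coordinate privacy-preserving quantizer in the spirit described: each clipped gradient coordinate is stochastically quantized to one of $2^b$ evenly spaced levels and then perturbed by $k$-ary randomized response, so each coordinate costs a constant number of bits. The function name \texttt{ldp\_quantize} and its parameters are hypothetical; this is a sketch under stated assumptions, not the paper's exact mechanism.

\begin{verbatim}
import numpy as np

def ldp_quantize(grad, clip=1.0, bits=2, epsilon=1.0, rng=None):
    """Illustrative sketch only (not the exact sqSGD mechanism):
    clip each coordinate to [-clip, clip], stochastically quantize
    it to one of k = 2**bits evenly spaced levels (unbiased), then
    apply k-ary randomized response so each reported coordinate
    satisfies epsilon-local differential privacy."""
    rng = np.random.default_rng() if rng is None else rng
    k = 2 ** bits
    levels = np.linspace(-clip, clip, k)
    g = np.clip(np.asarray(grad, dtype=float), -clip, clip)

    # Unbiased stochastic quantization: round each value to one of
    # its two neighboring levels, with probabilities proportional
    # to its distance from the opposite level.
    idx = np.clip(np.searchsorted(levels, g) - 1, 0, k - 2)
    lo, hi = levels[idx], levels[idx + 1]
    p_up = (g - lo) / (hi - lo)
    idx = idx + (rng.random(g.shape) < p_up)

    # k-ary randomized response: keep the true level with
    # probability e^eps / (e^eps + k - 1), otherwise report one of
    # the other k - 1 levels uniformly at random.
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    other = rng.integers(0, k - 1, size=g.shape)
    other = other + (other >= idx)   # uniform over levels != idx
    idx = np.where(rng.random(g.shape) < p_keep, idx, other)
    return levels[idx]               # `bits` bits per dimension
\end{verbatim}

In a sketch like this, each coordinate consumes its own privacy budget, so the total cost grows with dimension under composition; this is one way to see why a gradient subsampling step of the kind described in the abstract can help under a fixed privacy budget.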