Federated Learning (FL) is a promising machine learning paradigm that enables an analyzer to train a model without collecting users' raw data. To ensure users' privacy, differentially private federated learning has been intensively studied. Existing works are mainly based on the \textit{curator model} or the \textit{local model} of differential privacy, each of which has its pros and cons. The curator model allows greater accuracy but requires a trusted analyzer. In the local model, where users randomize their local data before sending it to the analyzer, no trusted analyzer is required, but the accuracy is limited. In this work, by leveraging the \textit{privacy amplification} effect in the recently proposed shuffle model of differential privacy, we achieve the best of both worlds, i.e., the accuracy of the curator model and strong privacy without relying on any trusted party. We first propose an FL framework in the shuffle model and a simple protocol (SS-Simple) extended from existing work. We find that SS-Simple provides only an insufficient privacy amplification effect in FL, since the dimension of the model parameters is quite large. To address this challenge, we propose an enhanced protocol (SS-Double) that increases the privacy amplification effect via subsampling. Furthermore, to boost the utility when the model size is greater than the user population, we propose an advanced protocol (SS-Topk) based on gradient sparsification. We also provide theoretical analysis and numerical evaluations of the privacy amplification of the proposed protocols. Experiments on a real-world dataset validate that SS-Topk improves the testing accuracy by 60.7\% over local-model-based FL.
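For intuition, a minimal sketch of the two ingredients named above for SS-Topk, namely top-k gradient sparsification and per-coordinate local randomization before shuffling, might look as follows. This is an illustrative assumption rather than the paper's exact protocol: the clipping bound, the Laplace mechanism, and all function and parameter names (\texttt{topk\_sparsify}, \texttt{local\_randomize}, \texttt{clip}, \texttt{eps}) are hypothetical choices for exposition.

\begin{verbatim}
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude coordinates of the gradient."""
    idx = np.argsort(np.abs(grad))[-k:]      # indices of the top-k entries
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse, idx

def local_randomize(values, clip, eps):
    """Clip each retained coordinate and add Laplace noise (illustrative
    per-coordinate local randomizer; not the paper's mechanism)."""
    clipped = np.clip(values, -clip, clip)
    noise = np.random.laplace(scale=2.0 * clip / eps, size=clipped.shape)
    return clipped + noise

# One user's contribution: sparsify the local gradient, randomize the retained
# coordinates, and submit (index, value) reports to a shuffler that permutes
# reports from all users before the analyzer aggregates them.
rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)               # toy gradient of dimension d = 10,000
sparse, idx = topk_sparsify(grad, k=100)
reports = list(zip(idx.tolist(),
                   local_randomize(sparse[idx], clip=1.0, eps=1.0).tolist()))
\end{verbatim}

In this sketch, sparsification shrinks each user's report from the full model dimension to k coordinates, which is the lever the abstract attributes to SS-Topk for coping with a model size larger than the user population.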