We consider training models on private data that are distributed across user devices. To ensure privacy, we add on-device noise and use secure aggregation so that only the noisy sum is revealed to the server. We present a comprehensive end-to-end system, which appropriately discretizes the data and adds discrete Gaussian noise before performing secure aggregation. We provide a novel privacy analysis for sums of discrete Gaussians and carefully analyze the effects of data quantization and modular summation arithmetic. Our theoretical guarantees highlight the complex tension between communication, privacy, and accuracy. Our extensive experimental results demonstrate that our solution essentially matches the accuracy of central differential privacy with fewer than 16 bits of precision per value.
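To make the pipeline concrete, the following is a minimal sketch of the steps named in the abstract: each client clips and discretizes its vector, adds discrete Gaussian noise, and reduces the result modulo a fixed integer, and the server sees only the modular sum. It is an illustration under stated assumptions, not the system's implementation. The function names, the parameters `clip`, `scale`, `sigma`, and `modulus`, the use of stochastic rounding, and the truncated-support sampler are all assumptions introduced here, and the plain modular sum merely stands in for a cryptographic secure aggregation protocol.

```python
import numpy as np


def discrete_gaussian(rng, sigma, size, tail=12):
    """Draw integers k with probability proportional to exp(-k^2 / (2 sigma^2)).

    Assumption: the support is truncated at about tail * sigma, so this only
    approximates the discrete Gaussian; it is not an exact sampler.
    """
    bound = int(np.ceil(tail * sigma)) + 1
    support = np.arange(-bound, bound + 1)
    weights = np.exp(-support.astype(np.float64) ** 2 / (2.0 * sigma ** 2))
    return rng.choice(support, size=size, p=weights / weights.sum())


def client_encode(rng, x, clip, scale, sigma, modulus):
    """Clip, quantize, add noise, and wrap one client's vector (hypothetical helper)."""
    # Clip to L2 norm at most `clip` so each contribution is bounded.
    norm = np.linalg.norm(x)
    if norm > clip:
        x = x * (clip / norm)
    # Scale to the integer grid and round stochastically (unbiased rounding).
    scaled = x * scale
    floor = np.floor(scaled)
    rounded = floor + (rng.random(x.shape) < (scaled - floor))
    # Add integer-valued noise on the device, then reduce modulo `modulus`.
    noisy = rounded.astype(np.int64) + discrete_gaussian(rng, sigma, x.shape)
    return np.mod(noisy, modulus)


def secure_sum(messages, modulus):
    """Stand-in for secure aggregation: only the modular sum is exposed."""
    return np.mod(np.sum(messages, axis=0), modulus)


def server_decode(total, scale, modulus):
    """Map residues in [0, modulus) back to a signed range and undo the scaling."""
    centered = np.where(total >= modulus // 2, total - modulus, total)
    return centered / scale


# Usage: 100 clients, 8-dimensional unit vectors, 16-bit modulus (illustrative values).
rng = np.random.default_rng(0)
clients = rng.normal(size=(100, 8))
clients /= np.linalg.norm(clients, axis=1, keepdims=True)
messages = [
    client_encode(rng, x, clip=1.0, scale=256.0, sigma=2.0, modulus=2**16)
    for x in clients
]
estimate = server_decode(secure_sum(messages, 2**16), 256.0, 2**16)
```

The sketch also shows where the tension between communication, privacy, and accuracy comes from: a larger `scale` reduces quantization error but pushes the sum closer to wrapping around `modulus`, while a larger `sigma` strengthens privacy at the cost of a noisier estimate.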