We consider training models on private data that is distributed across user devices. To ensure privacy, we add on-device noise and use secure aggregation so that only the noisy sum is revealed to the server. We present a comprehensive end-to-end system that appropriately discretizes the data and adds discrete Gaussian noise before performing secure aggregation. We provide a novel privacy analysis for sums of discrete Gaussians, and we analyze the effects of rounding the input data and of the modular summation arithmetic. Our theoretical guarantees highlight the complex tension between communication, privacy, and accuracy. Our extensive experimental results demonstrate that our solution achieves accuracy comparable to central differential privacy with 16 bits of precision per value.
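To make the described pipeline concrete, the sketch below illustrates the per-client computation the abstract outlines: scale and randomly round a real-valued vector to an integer grid, add discrete Gaussian noise, and reduce modulo m so that secure aggregation reveals only the noisy modular sum. This is a minimal sketch under assumed parameters, not the paper's implementation: the function names (`sample_discrete_gaussian`, `client_report`, `server_decode`), the granularity `gamma`, the noise scale `sigma`, and the modulus `m` are all illustrative, the truncated-support sampler is an approximation used only for demonstration, and further steps of the full system are omitted.

```python
import numpy as np


def sample_discrete_gaussian(sigma, size, rng, tail=12):
    # Illustrative sampler: draw exactly from the discrete Gaussian restricted
    # to a truncated support [-tail*sigma, tail*sigma]; the truncation error is
    # negligible for demonstration purposes, but this is not a production sampler.
    support = np.arange(-int(tail * sigma) - 1, int(tail * sigma) + 2)
    probs = np.exp(-support.astype(float) ** 2 / (2.0 * sigma ** 2))
    probs /= probs.sum()
    return rng.choice(support, size=size, p=probs)


def client_report(x, gamma=1e-3, sigma=40.0, m=2 ** 16, rng=None):
    # x: real-valued vector held on the device (assumed parameters are illustrative).
    rng = rng or np.random.default_rng()
    scaled = x / gamma
    # Unbiased randomized rounding to the integer grid.
    floor = np.floor(scaled)
    rounded = (floor + (rng.random(x.shape) < (scaled - floor))).astype(np.int64)
    noise = sample_discrete_gaussian(sigma, x.shape, rng)
    # Modular reduction: secure aggregation over Z_m reveals only the noisy sum mod m.
    return (rounded + noise) % m


def server_decode(mod_sum, gamma=1e-3, m=2 ** 16):
    # mod_sum: the securely aggregated sum of client reports, modulo m (numpy array).
    # Map back to a signed estimate of the noisy sum and rescale by gamma.
    signed = np.where(mod_sum >= m // 2, mod_sum.astype(np.int64) - m, mod_sum)
    return signed * gamma
```

Decoding in this sketch assumes the true noisy sum stays within one modular range; choosing the modulus m (and hence the bit width) large enough to avoid wrap-around while keeping communication low is part of the communication, privacy, and accuracy tension the abstract highlights.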