Federated data analytics is a framework for distributed data analysis where a server compiles noisy responses from a group of distributed low-bandwidth user devices to estimate aggregate statistics. Two major challenges in this framework are privacy, since user data is often sensitive, and compression, since the user devices have low network bandwidth. Prior work has addressed these challenges separately by combining standard compression algorithms with known privacy mechanisms. In this work, we take a holistic look at the problem and design a family of privacy-aware compression mechanisms that work for any given communication budget. We first propose a mechanism for transmitting a single real number that has optimal variance under certain conditions. We then show how to extend it to metric differential privacy for location-privacy use cases, and to vectors for application to federated learning. Our experiments illustrate that our mechanism can lead to better utility-compression trade-offs at the same privacy loss in a number of settings.
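To make the setting concrete, the following is a minimal sketch of a standard one-bit baseline in which each device sends a single randomized sign bit and the server debiases the average. It is not the mechanism proposed in this work; it is the textbook randomized-response construction for a bounded scalar. The function names, the assumption that each value lies in [-1, 1], and the choice of pure epsilon-local differential privacy are all illustrative assumptions.

```python
import math
import random


def plus_probability(x, epsilon):
    """Probability of sending +1 for a value x in [-1, 1].

    Assumed baseline: the probability is linear in x and ranges from
    1/(e^eps + 1) to e^eps/(e^eps + 1), so any two inputs induce output
    probabilities within a factor e^eps of each other (eps-local DP).
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    e = math.exp(epsilon)
    return 0.5 + x * (e - 1.0) / (2.0 * (e + 1.0))


def one_bit_private_encode(x, epsilon, rng=random):
    """Device side: compress x to one bit (+1 or -1) with privacy noise."""
    return 1 if rng.random() < plus_probability(x, epsilon) else -1


def estimate_mean(bits, epsilon):
    """Server side: unbiased estimate of the mean of the devices' values.

    E[bit] = x * (e^eps - 1) / (e^eps + 1), so rescaling the average bit
    by (e^eps + 1) / (e^eps - 1) removes the bias.
    """
    e = math.exp(epsilon)
    scale = (e + 1.0) / (e - 1.0)
    return scale * sum(bits) / len(bits)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0
    # Hypothetical device values, all equal to 0.3 for illustration.
    bits = [one_bit_private_encode(0.3, epsilon, rng) for _ in range(50000)]
    print(estimate_mean(bits, epsilon))
```

Each device transmits exactly one bit, so this sketch sits at the extreme low end of the communication budget; the estimator's variance grows as epsilon shrinks, which is the kind of utility, compression, and privacy trade-off the abstract refers to.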