Differential privacy (DP) is a formal notion for quantifying the privacy loss of algorithms. Algorithms in the central model of DP achieve high accuracy but make the strongest trust assumptions, whereas those in the local DP model make the weakest trust assumptions but incur substantial accuracy loss. The shuffled DP model (Bittau et al., 2017; Erlingsson et al., 2019; Cheu et al., 2019) has recently emerged as a feasible middle ground between the central and local models, requiring weaker trust assumptions than the former while promising higher accuracies than the latter. In this paper, we obtain practical communication-efficient algorithms in the shuffled DP model for two basic aggregation primitives used in machine learning: 1) binary summation, and 2) histograms over a moderate number of buckets. Our algorithms achieve accuracy that is arbitrarily close to that of central DP algorithms with an expected communication per user essentially matching what is needed without any privacy constraints! We demonstrate the practicality of our algorithms by experimentally comparing their performance to several widely-used protocols such as Randomized Response (Warner, 1965) and RAPPOR (Erlingsson et al., 2014).
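For concreteness, the sketch below (in Python, not taken from the paper) illustrates the Randomized Response baseline for binary summation in the shuffled model: each user locally flips their bit, a shuffler permutes the anonymous reports, and the analyzer debiases the observed count. The function names and the use of `epsilon` as the local flipping parameter are illustrative assumptions; privacy amplification by shuffling and the paper's own communication-efficient protocols are not modeled here.

```python
import math
import random


def randomize_bit(x, epsilon):
    """Local randomizer: classic binary randomized response (Warner, 1965).
    Reports the true bit with probability e^eps / (e^eps + 1)."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return x if random.random() < p_truth else 1 - x


def shuffle(messages):
    """Shuffler: uniformly permutes the reports, breaking the link to senders."""
    shuffled = list(messages)
    random.shuffle(shuffled)
    return shuffled


def estimate_sum(messages, epsilon):
    """Analyzer: debiases the shuffled reports to estimate the true binary sum.
    E[observed] = p * S + (1 - p) * (n - S), which is inverted for S."""
    n = len(messages)
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(messages)
    return (observed - (1 - p) * n) / (2 * p - 1)


if __name__ == "__main__":
    random.seed(0)
    bits = [random.randint(0, 1) for _ in range(10_000)]
    eps = 1.0  # local flipping parameter (hypothetical setting for the demo)
    reports = shuffle(randomize_bit(b, eps) for b in bits)
    print("true sum:", sum(bits), "estimate:", round(estimate_sum(reports, eps), 1))
```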