Key-value data is a naturally occurring data type that has not been thoroughly investigated in the local trust model. Existing local differentially private (LDP) solutions for computing statistics over key-value data suffer from the inherent accuracy limitations of each user adding their own noise. Multi-party computation (MPC) maintains better accuracy than LDP and similarly does not require a trusted central party. However, naively applying MPC to key-value data results in prohibitively expensive computation costs. In this work, we present selective multi-party computation, a novel approach to distributed computation that leverages DP leakage to efficiently and accurately compute statistics over key-value data. By providing each party with a view of a random subset of the data, we can capture subtractive noise. We prove that our protocol satisfies pure DP and is provably secure in the combined DP/MPC model. Our empirical evaluation demonstrates that we can compute statistics over 10,000 keys in 20 seconds and can scale up to 30 servers while obtaining results for a single key in under a second.
翻译:关键值数据是一种自然产生的数据类型,尚未在当地信托模式中进行彻底调查。当前用于计算关键值数据的统计数据的本地私人(LDP)解决方案因每个用户增加自己的噪音的内在准确性限制而存在差异。多党计算(MPC)比LDP更准确,同样不需要一个信任的中央方。然而,对关键值数据应用MPC天真无邪地造成了过高的昂贵计算成本。在这项工作中,我们提出了选择性的多党计算方法,一种将DP渗漏用于有效和准确地计算关键值数据的数据的分布计算新办法。通过向每个缔约方提供随机的一组数据,我们可以捕捉到减色噪音。我们证明我们的协议满足了纯DP,并且在合并的DP/MPC模型中非常可靠。我们的经验评估表明,我们可以在20秒内计算超过10,000个键的统计数据,并在第二秒内获得单一键的结果的同时,将存储到30个服务器。