Selecting the top-$k$ highest scoring items under differential privacy (DP) is a fundamental task with many applications. This work presents three new results. First, the exponential mechanism, permute-and-flip and report-noisy-max, as well as their oneshot variants, are unified into the Lipschitz mechanism, an additive noise mechanism with a single DP-proof via a mandated Lipschitz property for the noise distribution. Second, this new generalized mechanism is paired with a canonical loss function to obtain the canonical Lipschitz mechanism, which can directly select k-subsets out of $d$ items in $O(dk+d \log d)$ time. The canonical loss function assesses subsets by how many users must change for the subset to become top-$k$. Third, this composition-free approach to subset selection improves utility guarantees by an $\Omega(\log k)$ factor compared to one-by-one selection via sequential composition, and our experiments on synthetic and real-world data indicate substantial utility improvements.
翻译:选择不同隐私(DP)下的最高分数项目是一项基本任务。 这项工作提出了三项新结果。 首先, 指数机制, permute- flip 和 report- noisy- max, 以及它们的单发变体, 被统一到Lipschitz 机制中, 一个通过授权的 Lipschitz 属性, 使用单一的 DP 防爆的附加噪音机制, 用于噪音分布。 其次, 这一新的普遍化机制与一个卡通性损失函数相配, 以获得 Canonical Lipschitz 机制, 该机制可以直接从$O ( dk+d\log) d) 的美元项目中选择 k subsets 。 罐头损失函数评估子集, 取决于有多少用户必须改变子集, 才能成为顶价 $。 第三, 与子集选择的不包含成成分的方法相比, 以 $\ Omega (\log k) 倍的因数改善公用事业保障,, 与通过序列组成逐一选取的因数, 我们在合成和现实世界数据的实验显示有相当的效用改进。