Differential privacy protects individuals' data when statistical queries over aggregated databases are published: applying "obfuscating" mechanisms to the query results makes the released information less specific but, unavoidably, also decreases its utility. It has nevertheless been shown that for discrete data (e.g. counting queries), a mandated degree of privacy, and a reasonable interpretation of loss of utility, the Geometric obfuscating mechanism is optimal: it loses as little utility as possible. For continuous query results (e.g. real numbers), however, that optimality result does not hold. Our contribution here is to show that optimality is regained by using the Laplace mechanism for the obfuscation. The technical apparatus involved includes the earlier discrete result by Ghosh et al., recent work on abstract channels and their geometric representation as hyper-distributions, and the dual interpretations of distance between distributions provided by the Kantorovich-Rubinstein Theorem.
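For readers unfamiliar with the two mechanisms named above, the following is a minimal sketch of their standard textbook forms, not the formulation used in the paper itself. It assumes a query of sensitivity 1 and a privacy parameter `epsilon > 0`; all function names (`geometric_mechanism`, `laplace_mechanism`, `geometric_pmf`, `laplace_pdf`) are hypothetical and chosen only for illustration. The two-sided geometric noise is sampled as the difference of two one-sided geometric variables, and the Laplace noise as the difference of two exponential variables.

```python
import math
import random


def geometric_pmf(z, epsilon):
    """Probability that the two-sided geometric noise equals the integer z.

    Uses alpha = exp(-epsilon): P(Z = z) = (1 - alpha) / (1 + alpha) * alpha**|z|.
    """
    alpha = math.exp(-epsilon)
    return (1.0 - alpha) / (1.0 + alpha) * alpha ** abs(z)


def laplace_pdf(x, epsilon, sensitivity=1.0):
    """Density of Laplace noise with scale b = sensitivity / epsilon."""
    b = sensitivity / epsilon
    return math.exp(-abs(x) / b) / (2.0 * b)


def geometric_mechanism(true_count, epsilon, rng):
    """Obfuscate an integer count with two-sided geometric noise (assumed sensitivity 1)."""
    alpha = math.exp(-epsilon)

    def one_sided():
        # u lies in (0, 1], so log(u) is always defined; P(G >= k) = alpha**k.
        u = 1.0 - rng.random()
        return math.floor(math.log(u) / math.log(alpha))

    # The difference of two i.i.d. one-sided geometric variables is two-sided geometric.
    return true_count + one_sided() - one_sided()


def laplace_mechanism(true_value, epsilon, rng, sensitivity=1.0):
    """Obfuscate a real-valued result with Laplace noise of scale sensitivity / epsilon."""
    rate = epsilon / sensitivity
    # The difference of two i.i.d. exponential variables is Laplace distributed.
    return true_value + rng.expovariate(rate) - rng.expovariate(rate)


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
noisy_count = geometric_mechanism(42, epsilon=0.5, rng=rng)
noisy_value = laplace_mechanism(42.0, epsilon=0.5, rng=rng)
```

In both cases the probability of any given output changes by a factor of at most `exp(epsilon)` when the true result changes by 1, which is the differential-privacy guarantee in this simplified setting. The sketch illustrates only the mechanisms themselves; it says nothing about the optimality argument, which relies on the hyper-distribution and Kantorovich-Rubinstein machinery mentioned above.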