On the Choice of Databases in Differential Privacy Composition

Differential privacy (DP) is a widely applied paradigm for releasing data while maintaining user privacy. Its success is due in large part to its composition property, which guarantees privacy even across multiple data releases. Consequently, composition has received considerable attention from the research community: several composition theorems exist for adversaries with different amounts of flexibility in their choice of mechanisms. Apart from the mechanisms, however, the adversary can also choose the databases on which these mechanisms are invoked. The classic tool for analyzing the composition of DP mechanisms, the so-called composition experiment, allows neither for constraints on the databases nor for different assumptions about the adversary's prior knowledge of database membership. We therefore propose a generalized composition experiment (GCE), which has this flexibility. We show that composition theorems that hold with respect to the classic composition experiment also hold with respect to the worst case of the GCE. This implies that existing composition theorems give a privacy guarantee for more cases than are explicitly covered by the classic composition experiment. Beyond these theoretical insights, we demonstrate two practical applications of the GCE: the first is to give better privacy bounds in the presence of restrictions on the choice of databases; the second is to reason about how the adversary's prior knowledge influences the privacy leakage. In this context, we show a connection between adversaries with an uninformative prior and subsampling, an important primitive in DP. To the best of our knowledge, this paper is the first to analyze the interplay between the databases in DP composition. It thereby provides both a better understanding of composition and practical tools for obtaining better composition bounds.
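For readers less familiar with the two primitives the abstract refers to, composition and subsampling, the following minimal Python sketch computes some standard textbook bounds from the general DP literature: basic composition, advanced composition, and privacy amplification by Poisson subsampling. It does not implement the GCE or any bound from this paper; the function names and the parameter `delta_slack` are illustrative assumptions.

```python
import math


def basic_composition(params):
    """Basic composition: the (epsilon, delta) pairs of the composed
    mechanisms simply add up. `params` is a list of (epsilon, delta) tuples."""
    eps_total = sum(eps for eps, _ in params)
    delta_total = sum(delta for _, delta in params)
    return eps_total, delta_total


def advanced_composition(eps, delta, k, delta_slack):
    """Advanced composition for k mechanisms that are each (eps, delta)-DP.
    `delta_slack` (hypothetical name) is the extra failure probability
    traded for a smaller total epsilon."""
    eps_total = (math.sqrt(2 * k * math.log(1 / delta_slack)) * eps
                 + k * eps * (math.exp(eps) - 1))
    return eps_total, k * delta + delta_slack


def poisson_subsampled(eps, delta, q):
    """Standard amplification bound when an (eps, delta)-DP mechanism runs
    on a Poisson subsample in which each record is kept with probability q."""
    eps_amplified = math.log(1 + q * (math.exp(eps) - 1))
    return eps_amplified, q * delta


if __name__ == "__main__":
    # Ten releases, each (0.1, 1e-6)-DP.
    print(basic_composition([(0.1, 1e-6)] * 10))
    # One hundred releases: advanced composition beats the basic sum of 10.
    print(advanced_composition(0.1, 0.0, 100, 1e-6))
    # Sampling 10% of the records shrinks epsilon well below 1.
    print(poisson_subsampled(1.0, 1e-6, 0.1))
```

These worst-case bounds place no restriction on which databases the adversary picks, which is the setting the classic composition experiment covers.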
