Many differentially private algorithms for answering database queries involve a step that reconstructs a discrete data distribution from noisy measurements. This provides consistent query answers and reduces error, but often requires space that grows exponentially with dimension. Private-PGM is a recent approach that uses graphical models to represent the data distribution, with complexity proportional to that of exact marginal inference in a graphical model with structure determined by the co-occurrence of variables in the noisy measurements. Private-PGM is highly scalable for sparse measurements, but may fail to run in high dimensions with dense measurements. We overcome the main scalability limitation of Private-PGM through a principled approach that relaxes consistency constraints in the estimation objective. Our new approach works with many existing private query answering algorithms and improves scalability or accuracy with no privacy cost.
翻译:许多用于回答数据库查询的有差别的私人算法都包含一个步骤,该步骤从噪音测量中重建了离散的数据分布。它提供了一致的查询答案并减少了错误,但往往需要随着尺寸的倍增而增长的空间。私人PGM是最近的一种方法,它使用图形模型来代表数据分布,其复杂性与图形模型中精确边际推论的复杂度成比例,图形模型的结构是由杂乱测量中变量共同发生的结构所决定的。私人PGM对于稀少的测量而言可高度缩放,但可能无法在高维度上进行密集测量。我们通过原则性方法,减轻估算目标中的一致性限制,克服了私人PGM的主要可缩放限制。我们的新方法与许多现有的私人解答算法合作,并在没有隐私成本的情况下改进可缩放性或准确性。