We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation, and then attempts to find the synthetic dataset that most closely matches the noisy answers. We use a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows us to use efficient ML optimization techniques and tooling. Rather than answering all queries up front, we make judicious use of our privacy budget by iteratively and adaptively finding queries for which our (relaxed) synthetic data has high error, and then repeating the projection. We perform extensive experimental evaluations across a range of parameters and datasets, and find that our method outperforms existing algorithms in many cases, especially when the privacy budget is small or the query class is large.
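The adaptive loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the data are binary, the query class is the set of 2-way product queries (the mean of `x[i] * x[j]` over rows), the relaxed synthetic dataset has entries in $[0, 1]$, and the projection is done by plain projected gradient descent. The names `answer`, `project`, and `adaptive_release` are hypothetical, and the noise scale `sigma` is an illustrative parameter that is not calibrated to any formal privacy budget.

```python
import itertools
import math
import random


def answer(data, q):
    """Mean of the product query x[i] * x[j] over all rows (a 2-way marginal)."""
    i, j = q
    return sum(row[i] * row[j] for row in data) / len(data)


def project(synth, targets, steps=300, lr=0.1):
    """Projected gradient descent on the squared error to the noisy targets."""
    n = len(synth)
    d = len(synth[0])
    for _ in range(steps):
        grads = [[0.0] * d for _ in range(n)]
        for (i, j), noisy in targets.items():
            resid = answer(synth, (i, j)) - noisy
            for r, row in enumerate(synth):
                grads[r][i] += 2.0 * resid * row[j] / n
                grads[r][j] += 2.0 * resid * row[i] / n
        for r in range(n):
            # Scale the step by n so it does not shrink with the synthetic
            # dataset size, then clip back into the relaxed domain [0, 1].
            synth[r] = [min(1.0, max(0.0, v - lr * n * g))
                        for v, g in zip(synth[r], grads[r])]
    return synth


def adaptive_release(data, rounds=4, n_synth=20, sigma=0.02, seed=0):
    """Alternate noisy selection of a high-error query with re-projection.

    Assumption: sigma is illustrative only, not a calibrated privacy parameter.
    """
    rng = random.Random(seed)
    d = len(data[0])
    queries = list(itertools.combinations(range(d), 2))
    synth = [[rng.random() for _ in range(d)] for _ in range(n_synth)]
    targets = {}  # query -> noisy answer measured on the private data
    for _ in range(min(rounds, len(queries))):
        remaining = [q for q in queries if q not in targets]

        def noisy_error(q):
            # Gumbel noise gives a report-noisy-max style selection.
            u = rng.uniform(1e-12, 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            return abs(answer(data, q) - answer(synth, q)) + sigma * gumbel

        chosen = max(remaining, key=noisy_error)
        # Simple perturbation of the true answer before it is used.
        targets[chosen] = answer(data, chosen) + rng.gauss(0.0, sigma)
        synth = project(synth, targets)
    return synth, targets


if __name__ == "__main__":
    rng = random.Random(1)
    private = [[rng.randint(0, 1) for _ in range(4)] for _ in range(200)]
    synth, measured = adaptive_release(private)
    all_queries = itertools.combinations(range(4), 2)
    worst = max(abs(answer(private, q) - answer(synth, q)) for q in all_queries)
    print("queries measured:", len(measured), "max error:", round(worst, 3))
```

Only the queries selected in some round are ever measured on the private data, which is the sense in which the budget is spent adaptively instead of on all queries up front. A practical version would likely use automatic differentiation in place of the hand-written gradient and would calibrate both noise sources to the privacy budget.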