We address the problem of efficiently and effectively answering large numbers of queries on a sensitive dataset while ensuring differential privacy (DP). We analyze this problem in two distinct settings, grounding our work in a state-of-the-art DP mechanism for large-scale query answering: the Relaxed Adaptive Projection (RAP) mechanism. The first is a classic setting in the DP literature, in which all queries are known to the mechanism in advance. Within this setting, we identify challenges in the RAP mechanism's original analysis and overcome them with an enhanced implementation and analysis. We then extend the RAP mechanism to answer a more general and powerful class of queries (r-of-k thresholds) than previously considered. Evaluating this class empirically, we find that the mechanism can answer sets of queries that are orders of magnitude larger than those in prior work, and that it does so quickly and with high utility. We then define a second setting, motivated by real-world considerations and inspired by work in machine learning. In this new setting, a mechanism is given only partial knowledge of the queries that will be posed in the future, and it is expected to answer those queries with high utility. We formally define this setting and how to measure a mechanism's utility within it, and we then conduct a comprehensive empirical evaluation of the RAP mechanism's utility in this setting. We find that even with weak partial knowledge of future queries, the mechanism is able to efficiently and effectively answer arbitrary queries posed in the future. Taken together, the results from these two settings advance the state of the art in differentially private large-scale query answering.
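To make the r-of-k threshold query class concrete, the following is a minimal sketch and not the RAP mechanism itself. It assumes the common reading of such a query: given k attribute-value predicates, it returns the fraction of rows that satisfy at least r of them. The function name `r_of_k_threshold`, the `(column, value)` predicate encoding, and the toy rows are illustrative assumptions. The sketch computes the exact, non-private answer and adds no DP noise.

```python
from typing import Sequence, Tuple

Row = Sequence[int]
Predicate = Tuple[int, int]  # hypothetical encoding: (column index, target value)


def r_of_k_threshold(rows: Sequence[Row], predicates: Sequence[Predicate], r: int) -> float:
    """Exact (non-private) fraction of rows matching at least r of the k predicates."""
    if not rows:
        raise ValueError("dataset must contain at least one row")
    k = len(predicates)
    if not 0 <= r <= k:
        raise ValueError("r must lie between 0 and k")
    hits = 0
    for row in rows:
        # Count how many of the k predicates this row satisfies.
        matched = sum(1 for col, val in predicates if row[col] == val)
        if matched >= r:
            hits += 1
    return hits / len(rows)


# Toy dataset with three categorical columns (assumed for illustration only).
rows = [(0, 1, 2), (1, 1, 0), (0, 0, 2), (1, 0, 1)]
predicates = [(0, 0), (1, 1), (2, 2)]

one_of_three = r_of_k_threshold(rows, predicates, 1)    # disjunction of the predicates
two_of_three = r_of_k_threshold(rows, predicates, 2)
three_of_three = r_of_k_threshold(rows, predicates, 3)  # conjunction of the predicates
```

Under this reading, setting r = k gives a conjunction over the k predicates and r = 1 gives a disjunction, which is one way to see why the class is more general than either alone.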