Polls are a common way of collecting data, including product reviews and feedback forms. However, few data collectors give upfront privacy guarantees, and when they do, the guarantees are often vague claims about 'anonymity'. Instead, we propose giving quantifiable privacy guarantees through the statistical notion of differential privacy. Nevertheless, privacy does not come for free. At the heart of differential privacy lies an inherent trade-off between accuracy and privacy, so it is vital to adjust this trade-off properly before setting out to collect data. Altogether, getting started with differentially private data collection can be challenging. Ideally, a data analyst should not have to be concerned with all the details of differential privacy, but should rather get differential privacy by design. Still, to the best of our knowledge, no tools for gathering poll data under differential privacy exist. Motivated by this gap, we set out to engineer our own tool. Specifically, to make local differential privacy accessible for all, in this systems paper we present Randori, a set of novel open-source tools for differentially private poll data collection. Randori is intended to help data analysts keep their focus on what data their poll is collecting, as opposed to how they should collect it. Our tools also allow data analysts to analytically predict the accuracy of their poll. Furthermore, we show that differential privacy alone is not enough to achieve end-to-end privacy in a server-client setting. Consequently, we also investigate and mitigate implicit data leaks in Randori.