Differential privacy (DP) enables private data analysis but is difficult to use in practice. In a typical DP deployment, data controllers manage individuals' sensitive data and are responsible for answering data analysts' queries while protecting individuals' privacy; they do so by choosing $\epsilon$, the privacy loss budget, which controls how much noise is added to the query output. However, choosing $\epsilon$ is challenging for data controllers because it is difficult to interpret the privacy implications of such a choice for the individuals they wish to protect. To address this challenge, we first derive a privacy risk indicator (PRI) directly from the definition of ex-post per-instance privacy loss in the DP literature. The PRI indicates the impact of choosing $\epsilon$ on individuals' privacy. We then leverage the PRI to design an algorithm that chooses $\epsilon$ and releases the query output based on data controllers' privacy preferences. We design a modification of the algorithm that releases both the query output and $\epsilon$ while satisfying differential privacy, and we propose a solution that bounds the total privacy loss when the algorithm is used to answer multiple queries, without requiring controllers to set the total privacy loss budget. We demonstrate our contributions through an IRB-approved user study and experimental evaluations, which show that the PRI helps controllers choose $\epsilon$ and that our algorithms are efficient. Overall, our work contributes to making DP easier for controllers to use by lowering adoption barriers.
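The abstract does not spell out the ex-post per-instance privacy loss it builds on. As a minimal illustrative sketch only, the snippet below assumes the standard definition from the DP literature, $\mathcal{L}(o; x, x') = \ln\big(\Pr[M(x)=o] / \Pr[M(x')=o]\big)$, instantiated for the Laplace mechanism; the function names and the numbers are hypothetical, not taken from the paper:

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float,
                      rng: np.random.Generator) -> float:
    """Release a query answer with Laplace noise calibrated for epsilon-DP."""
    scale = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=scale)

def ex_post_privacy_loss(output: float, answer_with: float, answer_without: float,
                         sensitivity: float, epsilon: float) -> float:
    """Ex-post per-instance privacy loss of an observed Laplace output.

    With Laplace density p_mu(o) proportional to exp(-|o - mu| / b), where
    b = sensitivity / epsilon, the log-ratio of output densities on the
    dataset with vs. without one individual's record is
        (|o - q(x')| - |o - q(x)|) / b,
    whose absolute value is at most epsilon.
    """
    scale = sensitivity / epsilon
    return (abs(output - answer_without) - abs(output - answer_with)) / scale

# Hypothetical example: a counting query (sensitivity 1) on a small dataset.
rng = np.random.default_rng(0)
epsilon = 1.0
q_x = 100.0         # query answer on the full dataset
q_x_minus_i = 99.0  # query answer with individual i's record removed
o = laplace_mechanism(q_x, sensitivity=1.0, epsilon=epsilon, rng=rng)
loss_i = ex_post_privacy_loss(o, q_x, q_x_minus_i, sensitivity=1.0, epsilon=epsilon)
print(f"released output: {o:.2f}, ex-post loss for individual i: {loss_i:.3f} <= {epsilon}")
```

Note that this realized loss is specific to each individual and to the observed output, and is always bounded by $\epsilon$; that is what makes a per-instance indicator potentially more interpretable for controllers than $\epsilon$ alone.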