Disclosure of data analytics has important scientific and commercial justifications. However, disclosure should not be allowed without a due-diligence investigation of the risks that it poses to the information privacy of data subjects. Does the data analytics community have the right tools at its disposal to perform such due diligence? We present Privug, a method for exploring the leakage properties, or information privacy risks, involved in disclosing the results of an analytics program. The method uses classical off-the-shelf tools for Bayesian probabilistic programming, exploiting the fact that they can reinterpret a regular program probabilistically. This in turn allows information-theoretic analysis of program behavior. These tools and skills are often available to a data scientist pondering disclosure questions. For privacy researchers, the method provides a fast and lightweight way to experiment with privacy protection measures and mechanisms. We demonstrate that Privug is accurate, scalable, and applicable, and use it to explore parameters of a differential privacy mechanism.
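The idea of reinterpreting a regular program probabilistically and then measuring what its output reveals can be illustrated with a minimal sketch. This is not Privug's implementation: Privug relies on probabilistic programming tools, whereas the sketch below swaps in exact enumeration over a tiny discrete prior so that it needs only the Python standard library. The names `AGES`, `average_age`, and `leakage_bits`, as well as the uniform attacker prior, are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2

# Assumption: the attacker's prior is uniform over a tiny discrete set of ages.
AGES = (20, 30, 40, 50)


def average_age(ages):
    """The ordinary analytics program whose output would be disclosed."""
    return sum(ages) / len(ages)


def leakage_bits(program, n_others):
    """Mutual information I(secret; output) in bits.

    The secret is the first individual's age; `n_others` further individuals
    contribute to the same data set. Every possible data set ("world") is
    enumerated and weighted by the uniform prior, which replaces the sampling
    a probabilistic programming framework would perform.
    """
    worlds = list(product(AGES, repeat=n_others + 1))
    p_world = 1.0 / len(worlds)

    # Joint distribution over (secret, program output).
    joint = Counter()
    for world in worlds:
        joint[(world[0], program(world))] += p_world

    # Marginal distributions of the secret and of the output.
    p_secret, p_output = Counter(), Counter()
    for (secret, output), pr in joint.items():
        p_secret[secret] += pr
        p_output[output] += pr

    return sum(
        pr * log2(pr / (p_secret[secret] * p_output[output]))
        for (secret, output), pr in joint.items()
    )


if __name__ == "__main__":
    for n in (0, 1, 3):
        print(f"others={n}: leakage = {leakage_bits(average_age, n):.3f} bits")
```

With no other participants the average equals the secret, so the full 2 bits of prior uncertainty leak; adding participants lowers the leakage, which is the kind of trade-off such an analysis is meant to expose before disclosure.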