Differential privacy mechanisms are increasingly used to enable public release of sensitive datasets, relying on strong theoretical guarantees for privacy coupled with empirical evidence of utility. Utility is typically measured as the error on representative proxy tasks, such as descriptive statistics, multivariate correlations, or classification accuracy. In this paper, we propose an alternative evaluation methodology for measuring the utility of differentially private synthetic data in scientific research, a measure we term "epistemic parity." Our methodology consists of reproducing empirical conclusions of peer-reviewed papers that use publicly available datasets, and comparing these conclusions to those based on differentially private versions of the datasets. We instantiate our methodology over a benchmark of recent peer-reviewed papers that analyze public datasets in the ICPSR social science repository. We reproduce visualizations (qualitative results) and statistical measures (quantitative results) from each paper. We then generate differentially private synthetic datasets using state-of-the-art mechanisms and assess whether the conclusions stated in the paper hold. We find that, across reasonable epsilon values, epistemic parity only partially holds for each synthesizer we evaluated. Therefore, we advocate for both improving existing synthesizers and creating new data release mechanisms that offer strong guarantees for epistemic parity while achieving risk-aware, best-effort protection from privacy attacks.
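To make the evaluation concrete, the sketch below illustrates the epistemic-parity check in miniature: a paper's conclusion is reduced to the sign and statistical significance of a single correlation, computed on both the original dataset and a differentially private counterpart, and parity holds when the two qualitative findings agree. The naive_dp_placeholder synthesizer and the toy data are our own illustrative assumptions, not the state-of-the-art mechanisms or ICPSR datasets evaluated in the paper.

```python
# A minimal sketch of the epistemic-parity check, assuming a placeholder
# synthesizer. naive_dp_placeholder is NOT a real DP synthesizer; in practice
# it would be replaced by a state-of-the-art mechanism.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

def naive_dp_placeholder(df: pd.DataFrame, epsilon: float) -> pd.DataFrame:
    """Illustrative stand-in: perturb each numeric column with Laplace
    noise scaled to the column's range divided by epsilon."""
    out = df.copy()
    for col in out.columns:
        sensitivity = out[col].max() - out[col].min()
        out[col] += rng.laplace(0.0, sensitivity / epsilon, len(out))
    return out

def conclusion(df: pd.DataFrame, alpha: float = 0.05):
    """A paper's conclusion reduced to a qualitative finding: the sign
    and significance of the correlation between two variables."""
    r, p = pearsonr(df["x"], df["y"])
    return (np.sign(r), p < alpha)

# Toy data standing in for a public dataset analyzed in a paper.
x = rng.normal(size=1000)
original = pd.DataFrame({"x": x, "y": 0.5 * x + rng.normal(size=1000)})

for epsilon in (0.5, 1.0, 5.0):
    synthetic = naive_dp_placeholder(original, epsilon)
    parity = conclusion(original) == conclusion(synthetic)
    print(f"epsilon={epsilon}: epistemic parity holds: {parity}")
```

The same loop generalizes to the full benchmark: each paper contributes a set of qualitative and quantitative conclusions, and each (synthesizer, epsilon) pair is scored by the fraction of those conclusions it preserves.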