We propose model-based differentially private synthesis (modips), a Bayesian approach for releasing individual-level surrogate/synthetic datasets with privacy guarantees given the original data. The modips technique integrates differential privacy into model-based data synthesis. We introduce several variants of the general modips approach and different procedures for obtaining privacy-preserving posterior samples, a key step in modips. The uncertainty from the sanitization and synthesis processes in modips can be accounted for by releasing multiple synthetic datasets and quantified via an inferential combination rule proposed in this paper. We run empirical studies to examine how the number of synthetic datasets and the privacy budget allocation scheme affect inference based on the synthetic data.
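To make the general idea concrete, the following is a minimal sketch of model-based differentially private synthesis for binary data. It is an illustration under stated assumptions, not the specific procedure of this paper: it assumes a Beta-Bernoulli model, sanitizes the sufficient statistic (the count of ones) with the Laplace mechanism, treats the sample size n as public, and splits the total budget evenly across the m released datasets. The function name `modips_bernoulli_sketch` and its parameters are hypothetical.

```python
import numpy as np


def modips_bernoulli_sketch(x, epsilon, m, rng, a=1.0, b=1.0):
    """Release m synthetic binary datasets under a total privacy budget epsilon.

    Assumptions (illustrative only): Beta(a, b) prior, Bernoulli likelihood,
    Laplace noise on the count of ones, n public, even budget split.
    """
    x = np.asarray(x)
    n = x.size
    eps_each = epsilon / m  # sequential composition: each release gets epsilon / m
    count = x.sum()
    synthetic = []
    for _ in range(m):
        # Changing one record changes the count by at most 1 (sensitivity 1).
        noisy = count + rng.laplace(scale=1.0 / eps_each)
        # Clamping to [0, n] is post-processing and does not cost extra budget.
        noisy = min(max(noisy, 0.0), float(n))
        # Draw a parameter from the posterior built on the sanitized statistic.
        theta = rng.beta(a + noisy, b + n - noisy)
        # Draw a synthetic dataset of the same size from the model.
        synthetic.append(rng.binomial(1, theta, size=n))
    return synthetic


rng = np.random.default_rng(0)
original = rng.binomial(1, 0.3, size=200)
released = modips_bernoulli_sketch(original, epsilon=1.0, m=5, rng=rng)
```

Each element of `released` is one synthetic dataset. Because the budget is divided by m, a larger m adds more noise to each release, which is the kind of trade-off the empirical studies examine. Inference would then be combined across the m datasets with the rule proposed in the paper, which is not reproduced here.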