Data Augmentation MCMC for Bayesian Inference from Privatized Data

Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. In particular, we give results on the computational complexity, acceptance rate, and mixing properties of our MCMC. We illustrate the efficacy and applicability of our methods on a naïve-Bayes log-linear model as well as on a linear regression model.
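
The alternating scheme described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the toy model (Bernoulli records with a Beta prior), the privacy mechanism (the sum of the records released with Laplace noise of scale 1/eps), and the hypothetical names `da_mcmc` and `log_laplace_density`. The confidential-data step shown here proposes each record from the model and accepts it with the ratio of mechanism densities, which is one plausible way to use the privacy guarantee; it should not be read as the authors' exact algorithm. For this Laplace mechanism, changing one record moves the sum by at most 1, so each acceptance probability is at least exp(-eps).

```python
import numpy as np


def log_laplace_density(s_dp, total, eps):
    """Log density of s_dp when s_dp = total + Laplace(scale=1/eps) noise."""
    return np.log(eps / 2.0) - eps * abs(s_dp - total)


def da_mcmc(s_dp, n, eps, n_iter=2000, a=1.0, b=1.0, seed=0):
    """Toy data augmentation sampler (illustrative assumptions only).

    Model:     x_i ~ Bernoulli(theta), theta ~ Beta(a, b)
    Mechanism: s_dp = sum(x) + Laplace(scale=1/eps)
    Returns the theta draws and the fraction of accepted record proposals.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)      # initial guess for the confidential data
    total = int(x.sum())
    theta_samples = np.empty(n_iter)
    n_accept = 0

    for it in range(n_iter):
        # Step 1: update the parameter given the imputed confidential data
        # (conjugate Beta update, so this is an exact Gibbs draw).
        theta = rng.beta(a + total, b + n - total)

        # Step 2: update each confidential record given theta and s_dp.
        for i in range(n):
            proposal = int(rng.random() < theta)    # propose from the model
            new_total = total - int(x[i]) + proposal
            # The model terms cancel in the Metropolis-Hastings ratio,
            # leaving only the ratio of mechanism densities.
            log_ratio = (log_laplace_density(s_dp, new_total, eps)
                         - log_laplace_density(s_dp, total, eps))
            if rng.random() < np.exp(log_ratio):
                x[i] = proposal
                total = new_total
                n_accept += 1

        theta_samples[it] = theta

    return theta_samples, n_accept / (n_iter * n)


if __name__ == "__main__":
    samples, acc_rate = da_mcmc(s_dp=30.0, n=100, eps=1.0, n_iter=500)
    print("posterior mean of theta:", samples[100:].mean())
    print("acceptance rate:", acc_rate)
```

Proposing from the model keeps the sketch generic: the parameter update needs only a sampler for theta given complete data, and the record update needs only the mechanism density, so neither step requires evaluating the intractable likelihood of the privatized data.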
