Data Augmentation MCMC for Bayesian Inference from Privatized Data

Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional Markov chain Monte Carlo (MCMC) techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. We give results on the computational complexity, acceptance rate, and mixing properties of our MCMC algorithm. We illustrate the efficacy and applicability of our methods on a naïve Bayes log-linear model as well as on a linear regression model.
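
The alternating scheme described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption chosen for illustration and is not taken from the abstract: the confidential data are n Bernoulli(theta) records, the prior is Beta(1, 1), the released statistic is the record sum plus Laplace noise of scale 1/eps, and the function name da_mcmc is hypothetical. The confidential-data step shown here proposes each record from the model and accepts it according to the ratio of mechanism densities. Under these assumptions, changing one record moves the sum by at most 1, so that ratio is never smaller than exp(-eps), which is one way a privacy guarantee can bound the acceptance rate from below.

```python
import math
import random


def laplace_logpdf(y, loc, scale):
    """Log-density of a Laplace(loc, scale) distribution evaluated at y."""
    return -math.log(2.0 * scale) - abs(y - loc) / scale


def da_mcmc(s_dp, n, eps, n_iter=2000, seed=0):
    """Toy data-augmentation sampler (illustrative assumptions only).

    Assumed model: x_i ~ Bernoulli(theta), theta ~ Beta(1, 1).
    Assumed mechanism: s_dp = sum(x) + Laplace(scale = 1 / eps).
    Returns the theta draws and the acceptance rate of the record updates.
    """
    rng = random.Random(seed)
    scale = 1.0 / eps
    theta = 0.5
    # Initialise the unobserved confidential records from the model.
    x = [1 if rng.random() < theta else 0 for _ in range(n)]
    total = sum(x)
    draws, accepted, proposed = [], 0, 0

    for _ in range(n_iter):
        # Step 1: update theta given the imputed confidential data
        # (conjugate Beta update, exact Gibbs step).
        theta = rng.betavariate(1 + total, 1 + n - total)

        # Step 2: update each confidential record given theta and s_dp.
        for i in range(n):
            prop = 1 if rng.random() < theta else 0  # proposal from the model
            new_total = total - x[i] + prop
            # The model terms cancel, so only the mechanism densities remain.
            log_ratio = (laplace_logpdf(s_dp, new_total, scale)
                         - laplace_logpdf(s_dp, total, scale))
            proposed += 1
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x[i] = prop
                total = new_total
                accepted += 1

        draws.append(theta)

    return draws, accepted / proposed


if __name__ == "__main__":
    draws, rate = da_mcmc(s_dp=30.0, n=50, eps=1.0)
    kept = draws[len(draws) // 2:]  # discard the first half as burn-in
    print("posterior mean of theta:", sum(kept) / len(kept))
    print("acceptance rate:", rate)
```

Proposing from the model keeps the acceptance ratio free of the model likelihood, so in this sketch the same record update would work unchanged for any model that can be sampled record by record; only the mechanism density needs to be evaluated.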
