We address the practical implementation of a risk-weighted pseudo posterior synthesizer for microdata dissemination with a new re-weighting strategy that maximizes the utility of the released synthetic data at any level of formal privacy guarantee. Our re-weighting strategy applies to any vector-weighted pseudo posterior mechanism under which a vector of observation-indexed weights is used to downweight the likelihood contributions of records with high disclosure risk. We demonstrate our method on two different vector-weighted schemes that target high-risk records. Our new method for constructing record-indexed downweighting maximizes data utility under any privacy budget for the vector-weighted synthesizers by adjusting the by-record weights such that their individual Lipschitz bounds approach the bound for the entire database. Our method achieves an $(\epsilon = 2 \Delta_{\boldsymbol{\alpha}})$-asymptotic differential privacy (aDP) guarantee, globally, over the space of databases. We illustrate our methods using simulated highly skewed count data and compare the results to a scalar-weighted synthesizer under the Exponential Mechanism (EM). We also apply our methods to a sample of the Survey of Doctorate Recipients and demonstrate the practicality of our methods.
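The re-weighting idea can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`lipschitz_bounds`, `reweight`, the simulated `loglik` matrix) is hypothetical. It assumes that each record's Lipschitz bound is estimated as the largest absolute weighted log-likelihood over a fixed set of parameter draws, that the database-level bound $\Delta_{\boldsymbol{\alpha}}$ is the maximum of the by-record bounds, and that weights are capped at 1. In practice the synthesizer would be re-estimated under the new weights, which changes the draws; the sketch skips that step.

```python
import numpy as np


def lipschitz_bounds(loglik, alpha):
    """By-record bounds: max over draws of |alpha_i * log f(x_i | theta_s)|.

    loglik has shape (S draws, n records); alpha has shape (n,).
    """
    return np.max(np.abs(alpha[None, :] * loglik), axis=0)


def reweight(loglik, alpha):
    """Raise each weight until its record's bound meets the overall bound.

    Assumption (illustrative only): the parameter draws are held fixed, so
    a record's bound scales linearly in its weight. Weights are capped at 1.
    """
    overall = lipschitz_bounds(loglik, alpha).max()
    unweighted = np.max(np.abs(loglik), axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Records with zero log-likelihood carry no risk here: full weight.
        target = np.where(unweighted > 0, overall / unweighted, 1.0)
    return np.minimum(1.0, target), overall


# Simulated inputs (hypothetical): 200 draws, 50 records.
rng = np.random.default_rng(0)
loglik = -rng.gamma(shape=2.0, scale=1.0, size=(200, 50))
alpha = rng.uniform(0.2, 1.0, size=50)

alpha_new, delta = reweight(loglik, alpha)
epsilon = 2.0 * delta  # privacy guarantee level stated in the abstract
new_bounds = lipschitz_bounds(loglik, alpha_new)
```

Under these assumptions no weight decreases and no by-record bound exceeds the original overall bound, so the records retain more of their likelihood contribution while $\epsilon = 2 \Delta_{\boldsymbol{\alpha}}$ computed on the fixed draws is unchanged.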