Balancing privacy and accuracy is a major challenge in designing differentially private machine learning algorithms. To improve this tradeoff, prior work has studied privacy amplification methods, which analyze how common training operations, such as iteration and data subsampling, lead to higher privacy. In this paper, we analyze the privacy amplification properties of a new operation, sampling from the posterior, which is used in Bayesian inference. In particular, we consider Bernoulli sampling from a posterior described by a differentially private parameter. We provide an algorithm to compute the amplification factor in this setting, and we establish upper and lower bounds on this factor. Finally, we examine what happens when we draw k posterior samples instead of one.
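To make the setting concrete, below is a minimal numerical sketch, not the paper's algorithm: it estimates the effective privacy level of releasing a single Bernoulli bit whose success probability is the output of a differentially private mechanism. The mechanism here (clipped Laplace noise on an empirical mean) and all function names are illustrative assumptions; a binary release is exactly eps'-DP where eps' is the largest absolute log-ratio of the two outcome probabilities under neighboring datasets.

```python
import numpy as np

def bernoulli_release_eps(sample_theta_D, sample_theta_Dprime, n=10**6, seed=0):
    """Monte Carlo estimate of the effective epsilon of releasing one bit
    X ~ Bernoulli(theta), where theta is drawn from a DP mechanism run on
    neighboring datasets D and D'. Each callable returns `n` i.i.d. draws."""
    rng = np.random.default_rng(seed)
    p = sample_theta_D(rng, n).mean()        # P(X = 1 | D)  = E[theta | D]
    q = sample_theta_Dprime(rng, n).mean()   # P(X = 1 | D') = E[theta | D']
    # A binary output is eps'-DP with eps' the largest absolute log-ratio
    # of outcome probabilities, taken over both outcomes X = 1 and X = 0.
    return max(abs(np.log(p / q)), abs(np.log((1 - p) / (1 - q))))

# Hypothetical mechanism: eps = 1 Laplace noise on an empirical mean whose
# value differs by 0.01 between neighboring datasets, clipped into (0, 1)
# so the result is a valid Bernoulli parameter.
def clipped_laplace(mean, eps=1.0, sens=0.01):
    return lambda rng, n: np.clip(
        mean + rng.laplace(scale=sens / eps, size=n), 1e-3, 1 - 1e-3)

print(bernoulli_release_eps(clipped_laplace(0.50), clipped_laplace(0.51)))
# roughly 0.02, far below the parameter's eps = 1: the single Bernoulli
# bit leaks much less than the parameter it was sampled from.
```

In this toy instance the released bit is substantially more private than the underlying parameter, which is the amplification phenomenon the abstract describes; the paper's contribution is an algorithm and bounds that quantify this factor exactly rather than by simulation.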