Models leak information about their training data. This enables attackers to infer sensitive information about the training set, notably to determine whether a given data sample was part of it. Existing work empirically demonstrates the feasibility of such membership inference (tracing) attacks against complex deep learning models. However, the attack results depend on the specific training data, can be obtained only after the tedious process of training the model and mounting the attack, and provide no measure of the attack's confidence or of its unused potential power. In this paper, we theoretically analyze the maximum power of tracing attacks against high-dimensional graphical models, focusing on Bayesian networks. We provide a tight upper bound on the power (true positive rate) of these attacks, with respect to their error (false positive rate), for a given model structure, even before its parameters are learned. As it should be, the bound is independent of the knowledge and algorithm of any specific attack. It can help identify which model structures leak more information, how adding new parameters to the model increases its privacy risk, and what can be gained by adding new data points to reduce the overall information leakage. It thus provides a measure of the potential leakage of a model given its structure, as a function of the model complexity and the size of the training set.
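To make the power/error trade-off concrete, the following is a minimal sketch (not the paper's construction) of a tracing attack in a toy setting: the "Bayesian network" is assumed to be a product of d independent Bernoulli attributes, the released model is the vector of attribute frequencies estimated from n training records, and the attacker runs a likelihood-ratio test to decide membership. The script measures the attack's empirical power (true positive rate) at a fixed false positive rate; all names and parameter values are illustrative assumptions.

```python
# Hypothetical toy simulation of a tracing (membership inference) attack;
# illustrative only, not the paper's bound or attack.
import numpy as np

rng = np.random.default_rng(0)

d, n = 100, 50          # number of binary attributes, training-set size
trials = 2000           # Monte Carlo repetitions
alpha = 0.05            # target false positive rate (attack error)

pop_p = rng.uniform(0.2, 0.8, size=d)   # population attribute frequencies

def attack_score(record, model_p, pop_p):
    """Log-likelihood ratio of a record under the released model vs. the population."""
    eps = 1e-6
    model_p = np.clip(model_p, eps, 1 - eps)
    llr = (record * np.log(model_p / pop_p)
           + (1 - record) * np.log((1 - model_p) / (1 - pop_p)))
    return llr.sum()

scores_in, scores_out = [], []
for _ in range(trials):
    data = rng.binomial(1, pop_p, size=(n, d))   # training set drawn from the population
    model_p = data.mean(axis=0)                  # released model: per-attribute frequencies
    member = data[0]                             # a record that was in the training set
    non_member = rng.binomial(1, pop_p, size=d)  # a fresh record from the same population
    scores_in.append(attack_score(member, model_p, pop_p))
    scores_out.append(attack_score(non_member, model_p, pop_p))

# Threshold chosen so that the false positive rate on non-members is alpha.
threshold = np.quantile(scores_out, 1 - alpha)
power = np.mean(np.array(scores_in) > threshold)
print(f"empirical power (TPR) at FPR={alpha}: {power:.3f}")
```

Increasing d (model complexity) or decreasing n (training-set size) in this sketch raises the empirical power, which is the qualitative behavior the abstract's bound quantifies for actual Bayesian network structures.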