Membership inference (MI) attacks are currently the most popular tests for measuring privacy leakage in machine learning models. Given a machine learning model, a data point, and some auxiliary information, the goal of an MI attack is to determine whether the data point was used to train the model. In this work, we study the reliability of membership inference attacks in practice. Specifically, we show that a model owner can plausibly refute the result of a membership inference test on a data point $x$ by constructing a proof of repudiation, which proves that the model was trained without $x$. We design efficient algorithms that construct proofs of repudiation for all data points in the training dataset. Our empirical evaluation demonstrates the practical feasibility of our algorithms by constructing proofs of repudiation for popular machine learning models on MNIST and CIFAR-10. Consequently, our results call for a re-evaluation of the implications of membership inference attacks in practice.
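The abstract does not specify which MI attack is meant; a common baseline in the literature is the loss-threshold attack, which flags a point as a training member when the model's loss on it is unusually low (overfit models memorize their training data). The following is a minimal illustrative sketch of that decision rule only, not the attack or defense studied in this work; all function names, loss distributions, and the threshold are hypothetical.

```python
import random

def loss_threshold_mi_attack(losses, tau):
    """Baseline MI decision rule: predict 'member' for examples the
    model fits unusually well, i.e. whose loss falls below tau."""
    return [loss < tau for loss in losses]

# Toy demo with synthetic per-example losses (hypothetical numbers):
# members tend to have lower loss because the model has overfit them.
random.seed(0)
member_losses = [random.gauss(0.1, 0.05) for _ in range(100)]
nonmember_losses = [random.gauss(0.5, 0.15) for _ in range(100)]

tau = 0.3  # hypothetical threshold, e.g. tuned on shadow models
preds_members = loss_threshold_mi_attack(member_losses, tau)
preds_nonmembers = loss_threshold_mi_attack(nonmember_losses, tau)

tpr = sum(preds_members) / len(preds_members)        # true positive rate
fpr = sum(preds_nonmembers) / len(preds_nonmembers)  # false positive rate
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

A proof of repudiation, as described above, undermines exactly this kind of test: even if the attack flags $x$ as a member, the owner can exhibit evidence that the model is consistent with training on a dataset that excludes $x$.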