Differentially private (DP) machine learning allows us to train models on private data while limiting data leakage. DP formalizes this data leakage through a cryptographic game, where an adversary must predict whether a model was trained on a dataset D or a dataset D' that differs in just one example. If observing the training algorithm does not meaningfully increase the adversary's odds of correctly guessing which dataset the model was trained on, then the algorithm is said to be differentially private. The purpose of privacy analysis, then, is to upper bound the probability that any adversary could win this guessing game.

In our paper, we instantiate this hypothetical adversary in order to establish lower bounds on the probability that this distinguishing game can be won. We use this adversary to evaluate the importance of the capabilities granted to the adversary in the privacy analysis of DP training algorithms.

For DP-SGD, the most common method for training neural networks with differential privacy, our lower bounds are tight and match the theoretical upper bound. This implies that proving better upper bounds will require additional assumptions. Fortunately, we find that our attacks become significantly weaker when additional (and realistic) restrictions are placed on the adversary's capabilities. Thus, in the practical setting common to many real-world deployments, there is a gap between our lower bounds and the upper bounds provided by the analysis: differential privacy is conservative, and adversaries may not be able to leak as much information as the theoretical bound suggests.
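For reference, the standard (ε, δ)-DP guarantee that this distinguishing game formalizes can be stated as follows (the notation here is ours, not taken from the paper): a randomized training algorithm M is (ε, δ)-differentially private if, for all pairs of datasets D and D' differing in a single example and all sets of outcomes S,

$$\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta.$$

Smaller values of ε and δ mean that observing the output of M gives the adversary less advantage in guessing which of the two datasets was used.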
翻译:不同的私人(DP) 机器学习让我们能够对私人数据进行模型培训,同时限制数据泄漏。 DP 通过加密游戏将数据泄漏正式化, 对手必须预测模型是否在数据集 D 上受过训练, 或在一个例子中有所不同的数据集 D。 如果观察培训算法不会有意义地增加对手成功猜出模型是哪个数据集的概率, 那么算法据说是不同的私人的。 因此, 隐私分析的目的是将任何对手都能够成功猜出模型是哪个数据集的概率放在上方。 在我们的文件中, 我们即时利用这个假设对手, 以便确定这个区分游戏胜出的可能性的下限。 我们用这个对手来评估在DP培训算法的隐私分析中允许的对抗能力的重要性。 对于 DP- SGD 来说, 最常用的训练神经网络是不同的隐私, 我们的下限是紧密的, 符合理论上限的。 这意味着为了证明哪个数据是更好的上限, 就必须使用额外的假设。 幸运的是, 当我们的攻击在现实的高度上, 当现实的距离上, 我们的极限是更弱的时候, 。
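To make the DP-SGD mechanism referenced above concrete, here is a minimal, illustrative sketch of a single DP-SGD update: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, and Gaussian noise is added before averaging. The parameter names (C, sigma, lr) and the NumPy-based implementation are our own simplification, not the paper's code.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, C=1.0, sigma=1.0, lr=0.1, rng=None):
    """One DP-SGD step: clip per-example gradients to norm C, sum,
    add Gaussian noise with standard deviation sigma * C, then average."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, C / (norm + 1e-12)))  # clip to L2 norm at most C
    noise = rng.normal(0.0, sigma * C, size=params.shape)  # calibrated Gaussian noise
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean
```

The clipping bound C limits any single example's influence on the update, and the noise multiplier sigma (together with the sampling rate and number of steps) determines the (ε, δ) guarantee via the privacy accountant.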