The privacy of machine learning models has become a significant concern in many emerging Machine-Learning-as-a-Service applications, where prediction services based on well-trained models are offered to users on a pay-per-query basis. The lack of a defense mechanism poses a high risk to the privacy of the server's model, since an adversary can efficiently steal the model by querying only a few `good' data points. The interplay between a server's defense and an adversary's attack inevitably leads to an arms-race dilemma, as is common in Adversarial Machine Learning. To study the fundamental tradeoff between model utility from a benign user's view and privacy from an adversary's view, we develop new metrics to quantify this tradeoff, analyze their theoretical properties, and formulate an optimization problem to characterize the optimal adversarial attack and defense strategies. The developed concepts and theory match empirical findings on the `equilibrium' between privacy and utility. On the optimization side, the key ingredient that enables our results is a unified representation of the attack-defense problem as a min-max bi-level problem. The developed results are demonstrated through examples and experiments.
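As an illustrative sketch only (the notation here is ours, not taken from the paper), a min-max bi-level attack-defense representation of this kind typically couples an outer game between the server's defense $\delta$ and the adversary's attack strategy $\mathcal{A}$ with an inner problem in which the adversary fits a surrogate model to the (possibly perturbed) query responses:

\begin{align}
\min_{\delta \in \mathcal{D}} \; \max_{\mathcal{A}} \quad & \mathrm{Priv}\big(\hat{f}_{\mathcal{A},\delta}\big) \;-\; \lambda \, \mathrm{Util}(\delta) \\
\text{s.t.} \quad & \hat{f}_{\mathcal{A},\delta} \;=\; \operatorname*{arg\,min}_{f} \; \mathbb{E}_{x \sim \mathcal{A}} \, \ell\big(f(x),\, y_{\delta}(x)\big),
\end{align}

where $y_{\delta}(x)$ denotes the server's defended response to query $x$, $\mathrm{Priv}$ measures how closely the adversary's surrogate $\hat{f}_{\mathcal{A},\delta}$ recovers the server's model, $\mathrm{Util}$ measures prediction quality for benign users, and $\lambda$ trades off the two. The specific losses and constraint sets are assumptions made for illustration; the paper's actual formulation may differ.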