Machine Learning-as-a-Service (MLaaS) has become a widespread paradigm, making even the most complex machine learning models available to clients via, e.g., a pay-per-query principle. This allows users to avoid the time-consuming processes of data collection, hyperparameter tuning, and model training. However, by giving their customers access to the (predictions of their) models, MLaaS providers endanger their intellectual property, such as sensitive training data, optimised hyperparameters, or learned model parameters. Adversaries can create a copy of the model with (almost) identical behavior using only the prediction labels. While many variants of this attack have been described, only scattered defence strategies have been proposed, each addressing isolated threats. This raises the need for a thorough systematisation of the field of model stealing, to arrive at a comprehensive understanding of why these attacks succeed and how they could be holistically defended against. We address this by categorising and comparing model stealing attacks, assessing their performance, and exploring corresponding defence techniques in different settings. We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence strategy based on the goal and available resources. Finally, we analyse which defences are rendered less effective by current attack strategies.
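To make the core threat concrete, the following is a minimal sketch of a label-only model stealing attack, not any specific attack from the literature: a scikit-learn classifier stands in for the provider's model, a predict-only function stands in for the pay-per-query API, and all names (`query_api`, the surrogate architecture, the random query distribution) are illustrative assumptions.

```python
# Minimal illustrative sketch of label-only model stealing.
# Assumptions: the victim is reachable only through a predict-only
# interface (a stand-in for a pay-per-query MLaaS endpoint); the
# attacker's query distribution and surrogate model are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Victim: the MLaaS provider's model (internals hidden from the attacker).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X, y)

def query_api(samples):
    """Pay-per-query oracle: returns hard prediction labels only."""
    return victim.predict(samples)

# Attacker: draw query inputs (random here; real attacks often use a
# surrogate dataset), label them via the API, and train a copy.
rng = np.random.default_rng(0)
queries = rng.normal(size=(1000, 20))
stolen_labels = query_api(queries)
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Fidelity: how often the copy agrees with the victim on fresh inputs.
test = rng.normal(size=(500, 20))
print("agreement:", accuracy_score(query_api(test), surrogate.predict(test)))
```

The printed agreement rate is the fidelity notion used throughout the attack comparisons: how closely the stolen copy matches the victim's behaviour, independently of either model's accuracy on the true labels.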