Model extraction attacks are a class of attacks in which an adversary, using only queries to a victim model and the results it returns, obtains a machine learning model whose performance is comparable to that of the victim. This paper presents TEMPEST, a novel model extraction attack applicable to tabular data under a practical data-free setting. Although model extraction is more challenging on tabular data because of normalization, TEMPEST does not need the initial samples that previous attacks require; instead, it uses publicly available statistics to generate query samples. Experiments show that our attack achieves the same level of performance as previous attacks. Moreover, we find that using the mean and variance as the statistics for query generation, and applying the same normalization process as the victim model, improves the performance of our attack. We also discuss the feasibility of executing TEMPEST in the real world through an experiment with a medical diagnosis dataset. We plan to release the source code for reproducibility and as a reference for subsequent work.
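The query-generation idea in the abstract (build query samples from public per-feature statistics, normalize them as the victim does, then train a substitute on the victim's answers) can be illustrated with a minimal sketch. This is not the authors' implementation. Several parts are assumptions made only for illustration: queries are drawn from independent Gaussians parameterized by the mean and variance, normalization is taken to be standardization, `victim_predict` is a hypothetical stand-in for the black-box victim, and the statistics are made-up numbers.

```python
import numpy as np


def generate_queries(mean, var, n_queries, rng):
    """Draw synthetic tabular rows, one independent Gaussian per feature.

    Assumption: a Gaussian parameterized by public mean/variance. The
    abstract only says that mean and variance are used as statistics.
    """
    mean = np.asarray(mean, dtype=float)
    std = np.sqrt(np.asarray(var, dtype=float))
    return rng.normal(loc=mean, scale=std, size=(n_queries, mean.size))


def standardize(x, mean, var):
    """Assumed normalization: zero mean, unit variance per feature."""
    return (x - mean) / np.sqrt(var)


def victim_predict(x_norm):
    """Hypothetical black-box victim: a fixed linear classifier."""
    hidden_w = np.array([1.5, -2.0, 0.5])
    return (x_norm @ hidden_w > 0).astype(int)


def fit_substitute(x_norm, labels, lr=0.5, epochs=300):
    """Train a logistic-regression substitute with plain gradient descent."""
    w = np.zeros(x_norm.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x_norm @ w + b)))
        grad = p - labels
        w -= lr * (x_norm.T @ grad) / len(labels)
        b -= lr * grad.mean()
    return w, b


rng = np.random.default_rng(0)

# Made-up "publicly available" statistics for three features.
public_mean = np.array([50.0, 120.0, 25.0])
public_var = np.array([100.0, 400.0, 16.0])

queries = generate_queries(public_mean, public_var, n_queries=2000, rng=rng)
x_norm = standardize(queries, public_mean, public_var)
labels = victim_predict(x_norm)          # the only access to the victim

w, b = fit_substitute(x_norm, labels)
substitute_labels = ((x_norm @ w + b) > 0).astype(int)
agreement = float(np.mean(substitute_labels == labels))
print(f"agreement with victim on the query set: {agreement:.3f}")
```

The sketch measures agreement only on the query set itself; a real evaluation would compare the substitute and the victim on held-out data.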