Machine Learning (ML) algorithms are susceptible to adversarial attacks and deception, both during training and at deployment. Automatically reverse engineering the toolchains behind these adversarial ML attacks helps recover the tools and processes used to mount them. In this paper, we present two techniques that support automated identification and attribution of adversarial ML attack toolchains, using co-occurrence pixel statistics and Laplacian residuals. Our experiments show that the proposed techniques can identify the parameters used to generate adversarial samples. To the best of our knowledge, this is the first approach to attribute gradient-based adversarial attacks and estimate their parameters. Source code and data are available at: https://github.com/michael-goebel/ei_red
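To make the two named feature types concrete, below is a minimal sketch (not the paper's exact pipeline; function names, the horizontal pixel-pair choice, and the 3x3 Laplacian kernel are illustrative assumptions) of how a co-occurrence matrix of pixel values and a Laplacian high-pass residual might be computed for an 8-bit grayscale image:

```python
# Illustrative sketch only -- not the authors' implementation from the
# linked repository. Assumes 8-bit grayscale input.
import numpy as np
from scipy.ndimage import convolve

def cooccurrence(img: np.ndarray, levels: int = 256) -> np.ndarray:
    """Joint histogram of horizontally adjacent pixel-value pairs,
    normalized to a joint probability distribution."""
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    mat, _, _ = np.histogram2d(
        left, right, bins=levels, range=[[0, levels], [0, levels]]
    )
    return mat / mat.sum()

def laplacian_residual(img: np.ndarray) -> np.ndarray:
    """High-pass residual from a 3x3 Laplacian kernel; adversarial
    perturbations tend to leave traces in such residuals."""
    kernel = np.array([[0,  1, 0],
                       [1, -4, 1],
                       [0,  1, 0]], dtype=np.float64)
    return convolve(img.astype(np.float64), kernel, mode="reflect")

# Example usage on a placeholder image
img = np.random.randint(0, 256, (224, 224), dtype=np.uint8)
C = cooccurrence(img)          # (256, 256) co-occurrence statistics
R = laplacian_residual(img)    # (224, 224) high-pass residual
```

Features like `C` and `R` could then be fed to a classifier to attribute the attack toolchain and estimate its parameters, which is the task the abstract describes.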