Despite our best efforts, deep learning models remain highly vulnerable to even tiny adversarial perturbations applied to their inputs. The ability to craft adversarial perturbations against black-box models using only the information contained in their outputs is a practical threat to real-world systems, such as autonomous cars or machine learning models exposed as a service (MLaaS). Of particular interest are sparse attacks, which aim to minimize the number of perturbed pixels, measured by the l_0 norm, required to mislead a model while observing only the decision (the predicted label) returned to a model query; the so-called decision-based attack setting. The realization of sparse attacks against black-box models demonstrates that machine learning models are more vulnerable than previously believed. However, such an attack leads to an NP-hard optimization problem. We develop an evolution-based algorithm, SparseEvo, for this problem and evaluate it against both convolutional deep neural networks and vision transformers; notably, vision transformers have not yet been investigated under a decision-based attack setting. SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks. Although conceptually simple, the attack algorithm is also competitive, with only a limited query budget, against state-of-the-art gradient-based white-box attacks on standard computer vision tasks such as ImageNet. Importantly, the query-efficient SparseEvo, and decision-based attacks in general, raise new questions about the safety of deployed systems and pose new directions for studying and understanding the robustness of machine learning models.
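To make the attack setting concrete, the decision-based sparse attack described above can be stated as an l_0-minimization problem. This is a standard formulation given for illustration, not a quotation from the paper; the symbols x, y, f and \delta are introduced here only for this sketch:

\[
\min_{\delta} \; \|\delta\|_0 \quad \text{subject to} \quad f(x + \delta) \neq y,
\]

where x is the input image with label y, f(\cdot) returns only the predicted label for a single model query, and \delta is the perturbation; for a targeted attack the constraint instead becomes f(x + \delta) = y_{\text{target}} for a chosen target label. Because the l_0 norm counts the number of changed pixels and is non-convex and non-differentiable, solving this problem exactly is NP-hard, which motivates a query-efficient evolutionary search such as SparseEvo.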