Although deep learning has made remarkable progress in processing various types of data such as images, text, and speech, deep models are known to be susceptible to adversarial perturbations: perturbations specifically designed and added to the input to make the target model produce erroneous output. Most existing studies on generating adversarial perturbations attempt to perturb the entire input indiscriminately. In this paper, we propose ExploreADV, a general and flexible adversarial attack system capable of modeling regional and imperceptible attacks, allowing users to explore various kinds of adversarial examples as needed. We adapt and combine two existing boundary attack methods, DeepFool and the Brendel\&Bethge Attack, and propose a mask-constrained adversarial attack system that generates minimal adversarial perturbations under pixel-level constraints, namely ``mask-constraints''. We study different ways of generating such mask-constraints, taking into account the variance and importance of the input features, and show that our adversarial attack system offers users good flexibility to focus on sub-regions of inputs, explore imperceptible perturbations, and understand the vulnerability of pixels/regions to adversarial attacks. Extensive experiments and a user study demonstrate the effectiveness of our system.
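To make the notion of a mask-constraint concrete, the following is a minimal sketch of how a pixel-level mask can restrict a perturbation to a chosen sub-region. The FGSM-style gradient-sign step and all function names here are illustrative assumptions for exposition, not the paper's actual DeepFool/Brendel\&Bethge adaptation:

```python
import numpy as np

def apply_mask_constraint(delta, mask):
    """Zero out the perturbation outside the allowed region (the mask-constraint).

    `mask` is a {0, 1} array of the same shape as the input: 1 marks pixels
    the attack may modify, 0 marks pixels that must stay untouched.
    """
    return delta * mask

def masked_attack_step(x, grad, mask, step_size=0.01, eps=0.1):
    """One hypothetical gradient-sign attack step under a mask-constraint.

    This is a simplified FGSM-style step, used only to illustrate where the
    mask enters; a boundary attack would compute the step differently.
    """
    delta = step_size * np.sign(grad)            # attack direction from the loss gradient
    delta = apply_mask_constraint(delta, mask)   # enforce the pixel-level constraint
    delta = np.clip(delta, -eps, eps)            # keep the perturbation small (imperceptible)
    return np.clip(x + delta, 0.0, 1.0)          # keep the result a valid image
```

Under this formulation, any attack that produces an update direction can be made regional by projecting each step through the mask, which is what lets a user target only a sub-region of the input.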