Universal adversarial attacks, which hinder most deep neural network (DNN) tasks using only a single small perturbation called a universal adversarial perturbation (UAP), are a realistic security threat to the practical application of DNNs. In particular, such attacks cause serious problems in medical imaging. Given that computer-based systems are generally operated under a black-box condition, in which only queries on inputs are allowed and only outputs are accessible, the impact of UAPs seems limited, because widely used algorithms for generating UAPs require a white-box condition, in which adversaries can access the model weights and loss gradients. Nevertheless, we demonstrate that UAPs can easily be generated from a relatively small dataset under black-box conditions. In particular, we propose a method for generating UAPs using a simple hill-climbing search based only on DNN outputs, and we demonstrate the validity of the proposed method on representative DNN-based medical image classification tasks. The black-box UAPs can be used to conduct both non-targeted and targeted attacks. Overall, the black-box UAPs achieved high attack success rates (40% to 90%), although some had relatively low success rates because the method uses only limited information to generate UAPs. Vulnerability to black-box UAPs was observed across several model architectures. The results indicate that adversaries can generate UAPs through a simple procedure under the black-box condition to foil or control DNN-based medical image diagnoses, and that UAPs are a more realistic security threat.
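The abstract only outlines the approach, but the core idea of a hill-climbing search over UAP candidates driven solely by model outputs can be illustrated with a minimal sketch. The sketch below is a hypothetical illustration, not the authors' exact algorithm: it assumes a Keras-style `model.predict` that returns class scores, images as float arrays in [0, 1], and an L-infinity bound `eps` on the perturbation; the names `hill_climb_uap`, `fooling_rate`, and `step` are illustrative.

```python
import numpy as np

def query_labels(model, images):
    """Query the black-box model; only predicted labels are assumed accessible."""
    return model.predict(images).argmax(axis=1)

def fooling_rate(model, images, clean_labels, uap, eps):
    """Fraction of inputs whose predicted label changes under the clipped UAP."""
    adv = np.clip(images + np.clip(uap, -eps, eps), 0.0, 1.0)
    return np.mean(query_labels(model, adv) != clean_labels)

def hill_climb_uap(model, images, eps=0.05, step=0.01, iters=1000, seed=0):
    """Hill-climbing search for a non-targeted UAP using only model outputs
    (illustrative sketch; parameter values are assumptions, not the paper's)."""
    rng = np.random.default_rng(seed)
    clean_labels = query_labels(model, images)
    uap = np.zeros_like(images[0])
    best = fooling_rate(model, images, clean_labels, uap, eps)
    for _ in range(iters):
        # Propose a small random modification and keep it only if it
        # increases the fooling rate on the (relatively small) reference set.
        candidate = np.clip(uap + step * rng.standard_normal(uap.shape), -eps, eps)
        score = fooling_rate(model, images, clean_labels, candidate, eps)
        if score > best:
            uap, best = candidate, score
    return uap, best
```

A targeted variant would follow the same loop but score candidates by the fraction of inputs classified as the attacker's chosen label rather than by label changes.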