Recent research has shown that Deep Neural Networks (DNNs) are highly vulnerable to adversarial samples, which often transfer across models and can therefore be used to attack unknown black-box models. To improve the transferability of adversarial samples, several feature-based attack methods have been proposed that disrupt neuron activations in the middle layers. However, current state-of-the-art feature-based attacks typically incur additional computation cost to estimate the importance of neurons. To address this challenge, we propose a feature-level attack method based on Singular Value Decomposition (SVD). Our approach is inspired by the observation that the eigenvectors associated with the larger singular values of the decomposed middle-layer features exhibit superior generalization and attention properties. Specifically, we retain only the feature component associated with the top-1 singular value, compute output logits from it, and combine these with the original logits to optimize the adversarial perturbation. Extensive experiments verify the effectiveness of the proposed method, which significantly enhances the transferability of adversarial samples against various baseline models and defense strategies. The source code of this study is available at \url{https://anonymous.4open.science/r/SVD-SSA-13BF/README.md}.
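As a rough illustration of the top-1 retention step described in the abstract, the following sketch reconstructs a rank-1 version of a feature matrix and mixes two logit vectors. It is not the authors' implementation. The helper names (`top1_singular_triplet`, `rank1_feature`, `combined_logits`), the flattening of a feature map into a channels-by-positions matrix, and the weighted-sum combination with `weight` are all assumptions made here for illustration; the abstract does not specify these details. Power iteration is used only to keep the example dependency-free; in practice a library routine such as `numpy.linalg.svd` would normally be used.

```python
import math
import random


def top1_singular_triplet(matrix, iters=200, seed=0):
    """Estimate (sigma, u, v) for the largest singular value by power iteration.

    Hypothetical helper for illustration only.
    """
    rows, cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)
    # Random start, so it is almost surely not orthogonal to the top right singular vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # u = A v, then w = A^T u; normalizing w is one power-iteration step on A^T A.
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: no dominant direction exists
            break
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v


def rank1_feature(matrix):
    """Keep only the component tied to the top-1 singular value: sigma * u * v^T.

    Assumption: `matrix` is a middle-layer feature map flattened to
    (channels x spatial positions).
    """
    sigma, u, v = top1_singular_triplet(matrix)
    return [[sigma * ui * vj for vj in v] for ui in u]


def combined_logits(logits, logits_top1, weight=1.0):
    """Assumed combination rule: original logits plus weighted top-1 logits."""
    return [a + weight * b for a, b in zip(logits, logits_top1)]


if __name__ == "__main__":
    feature = [[3.0, 0.0], [0.0, 1.0]]  # toy 2x2 "feature matrix"
    print(rank1_feature(feature))
    print(combined_logits([1.0, 2.0], [0.5, -1.0]))
```

In an actual attack, the rank-1 feature would be passed through the remaining layers of the network to obtain the second logit vector, and the perturbation would be updated using gradients of a loss over the combined logits; those steps depend on the model and are omitted here.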