Over the past decades, the rise of artificial intelligence has given us the capability to solve some of the most challenging problems in our daily lives, such as cancer prediction and autonomous navigation. However, these applications may not be reliable unless they are secured against adversarial attacks. Moreover, recent work has demonstrated that some adversarial examples are transferable across different models. It is therefore crucial to prevent such transferability through robust models that resist adversarial manipulation. In this paper, we propose a feature randomization-based approach that resists eight adversarial attacks targeting deep learning models in the testing phase. Our novel approach changes the training strategy of the target network classifier by selecting random feature samples. We consider attackers under Limited-Knowledge and Semi-Knowledge conditions, covering the most prevalent types of adversarial attacks. We evaluate the robustness of our approach on the well-known UNSW-NB15 dataset, which includes realistic and synthetic attacks. We then demonstrate that our strategy outperforms the existing state-of-the-art approach, the Most Powerful Attack, which fine-tunes the network model against specific adversarial attacks. Finally, our experimental results show that our methodology can secure the target network and resists the transferability of adversarial attacks by over 60%.
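To make the core idea concrete, the following is a minimal sketch of feature-randomization training for a tabular classifier. It is an illustration of the general technique (randomly selecting a subset of input features at each training step), not the paper's exact configuration; the feature count, mask ratio, network architecture, and all variable names below are illustrative assumptions.

```python
# Minimal sketch of feature-randomization training, assuming a tabular
# dataset in the style of UNSW-NB15. NUM_FEATURES and KEEP_RATIO are
# hypothetical values, not the paper's reported settings.
import torch
import torch.nn as nn

NUM_FEATURES = 42   # assumed feature count for a UNSW-NB15-style dataset
KEEP_RATIO = 0.7    # hypothetical fraction of features kept per batch

class Classifier(nn.Module):
    """A small feed-forward classifier standing in for the target network."""
    def __init__(self, in_dim: int, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def random_feature_mask(x: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Zero out a random subset of feature columns for this batch, so the
    classifier never learns to rely on one fixed feature set."""
    num_keep = max(1, int(keep_ratio * x.shape[1]))
    keep_idx = torch.randperm(x.shape[1])[:num_keep]
    mask = torch.zeros(x.shape[1])
    mask[keep_idx] = 1.0
    return x * mask  # broadcasts the mask over the batch dimension

model = Classifier(NUM_FEATURES)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch standing in for real training data.
x = torch.randn(32, NUM_FEATURES)
y = torch.randint(0, 2, (32,))

for step in range(10):
    logits = model(random_feature_mask(x, KEEP_RATIO))
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because the subset of active features changes at every step, gradient-based adversarial perturbations crafted against one feature subset are less likely to transfer to the deployed model, which is the intuition behind resisting attack transferability.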