Machine Learning (ML) techniques can facilitate the automation of malicious software (malware for short) detection, but suffer from evasion attacks. Many studies counter such attacks heuristically, lacking both theoretical guarantees and demonstrated defense effectiveness. In this paper, we propose a new adversarial training framework, termed Principled Adversarial Malware Detection (PAD), which offers convergence guarantees for robust optimization methods. PAD builds on a learnable convex measurement that quantifies distribution-wise discrete perturbations to protect malware detectors from adversaries, so that, for smooth detectors, adversarial training can be performed with theoretical treatment. To promote defense effectiveness, we propose a new mixture of attacks to instantiate PAD, hardening deep neural network-based measurements and malware detectors. Experimental results on two Android malware datasets demonstrate that: (i) the proposed method significantly outperforms state-of-the-art defenses; (ii) it hardens ML-based malware detection against 27 evasion attacks with detection accuracies greater than 83.45%, at the cost of an accuracy decrease of less than 2.16% in the absence of attacks; and (iii) it matches or outperforms many anti-malware scanners on VirusTotal against realistic adversarial malware.
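To make the min-max structure described above concrete, the following is a minimal, hypothetical PyTorch sketch of adversarial training with a learned perturbation measurement: an inner maximization crafts perturbed (relaxed-binary) malware features that raise the detector loss while evading the measurement, and an outer minimization fits both models on clean and perturbed samples. All names (Detector, Measurement, pgd_perturb, train_step, lam) and hyperparameters are illustrative assumptions, not the paper's PAD implementation.

```python
# Hedged sketch: adversarial training with a learned perturbation measurement.
# Hypothetical names/values; not the paper's PAD code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Detector(nn.Module):            # malware classifier over binary features
    def __init__(self, d=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, 2))
    def forward(self, x):
        return self.net(x)

class Measurement(nn.Module):         # scores how "perturbed" an input looks
    def __init__(self, d=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, 1))
    def forward(self, x):
        return self.net(x).squeeze(-1)

def pgd_perturb(f, g, x, y, steps=10, step_size=0.1, lam=1.0):
    """Inner maximization: continuous relaxation of discrete (binary) feature
    flips; the attack raises the detector loss while evading the measurement."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        x_adv = torch.clamp(x + delta, 0.0, 1.0)
        loss = F.cross_entropy(f(x_adv), y) - lam * g(x_adv).mean()
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step_size * grad.sign()).detach().requires_grad_(True)
    return torch.clamp(x + delta, 0.0, 1.0).round().detach()   # back to {0,1}

def train_step(f, g, opt, x, y, lam=1.0):
    """Outer minimization: fit the detector on clean and perturbed samples,
    and train the measurement to separate the two populations."""
    x_adv = pgd_perturb(f, g, x, y, lam=lam)
    opt.zero_grad()
    det_loss = F.cross_entropy(f(x), y) + F.cross_entropy(f(x_adv), y)
    meas_loss = F.binary_cross_entropy_with_logits(
        torch.cat([g(x), g(x_adv)]),
        torch.cat([torch.zeros(len(x)), torch.ones(len(x))]))
    (det_loss + meas_loss).backward()
    opt.step()

if __name__ == "__main__":
    f, g = Detector(), Measurement()
    opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
    x = torch.randint(0, 2, (32, 1024)).float()   # toy binary feature vectors
    y = torch.randint(0, 2, (32,))                # toy benign/malware labels
    train_step(f, g, opt, x, y)
```

At test time, under this toy setup, an input would be flagged when either the detector predicts malware or the measurement's score exceeds a threshold, which is how the learned measurement contributes to robustness against inputs the detector alone would miss.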