Machine Learning (ML) techniques facilitate the automation of malicious software (malware for short) detection, but suffer from evasion attacks. Many researchers counter such attacks heuristically, with defenses that lack both theoretical guarantees and effectiveness. We hence propose a new adversarial training framework, termed Principled Adversarial Malware Detection (PAD), which encourages convergence guarantees for robust optimization methods. PAD rests on a learnable convex measurement that quantifies distribution-wise discrete perturbations and protects the malware detector from adversaries; with it, adversarial training of smooth detectors can be performed heuristically yet with theoretical treatments. To promote defense effectiveness, we propose a new mixture of attacks to instantiate PAD, enhancing both the deep neural network-based measurement and the malware detector. Experimental results on two Android malware datasets demonstrate that: (i) the proposed method significantly outperforms the state-of-the-art defenses; (ii) it can harden ML-based malware detection against 27 evasion attacks with detection accuracies greater than 83.45%, while suffering an accuracy decrease smaller than 2.16% in the absence of attacks; (iii) it matches or outperforms many anti-malware scanners in the VirusTotal service against realistic adversarial malware.