We present a new algorithm to train a robust malware detector. Modern malware detectors rely on machine learning algorithms. The adversarial objective is to devise alterations to the malware code that decrease the chance of detection while preserving the functionality and realism of the malware. Adversarial learning is effective at improving robustness, but generating functional and realistic adversarial malware samples is non-trivial. This is because: i) in contrast to tasks that can use gradient-based feedback, adversarial learning is hard in a domain without a differentiable mapping function from the problem space (malware code inputs) to the feature space; and ii) it is difficult to ensure that the adversarial malware remains realistic and functional. This presents a challenge for developing scalable adversarial machine learning algorithms for large datasets at a production or commercial scale to realize robust malware detectors. We propose an alternative: perform adversarial learning in the feature space rather than the problem space. We prove that the projection into the feature space of perturbed yet valid malware from the problem space is always a subset of the adversarial examples generated directly in the feature space. Hence, a network made robust against feature-space adversarial examples is inherently robust against problem-space adversarial examples. We formulate a Bayesian adversarial learning objective that captures the distribution of models for improved robustness. We prove that our learning method bounds the difference between the adversarial risk and the empirical risk, explaining the improved robustness. We show that adversarially trained Bayesian neural networks (BNNs) achieve state-of-the-art robustness. Notably, adversarially trained BNNs remain robust against stronger attacks with larger attack budgets, by a margin of up to 15%, on a recent production-scale malware dataset of more than 20 million samples.
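The feature-space adversarial training described above can be illustrated with a minimal sketch. The logistic detector, synthetic data, and single-step L-infinity attack below are illustrative assumptions for exposition; they are not the paper's actual Bayesian model, attack, or malware dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_feature_space(x, y, w, b, eps):
    """One-step L-infinity perturbation in the feature space (FGSM-style).
    For logistic loss, the gradient w.r.t. the input x is (p - y) * w."""
    p = sigmoid(x @ w + b)
    grad = (p - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad)

# Toy stand-in for malware feature vectors: 200 samples, 10 features.
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = (X @ true_w > 0).astype(float)

w = np.zeros(10)
b = 0.0
lr, eps = 0.1, 0.1
for _ in range(300):
    # Attack the current model in the feature space, then train on the
    # perturbed points (standard adversarial training loop).
    X_adv = fgsm_feature_space(X, y, w, b, eps)
    p = sigmoid(X_adv @ w + b)
    w -= lr * ((p - y) @ X_adv) / len(y)
    b -= lr * (p - y).mean()

clean_acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
adv_acc = ((sigmoid(fgsm_feature_space(X, y, w, b, eps) @ w + b) > 0.5) == y).mean()
```

Because every problem-space modification of a binary maps to some feature-space point within the perturbation set, training against the (larger) feature-space set confers robustness against the problem-space attacks it contains.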