Recent work has shown that adversarial Windows malware samples (referred to as adversarial EXEmples in this paper) can bypass machine learning-based detection relying on static code analysis by perturbing relatively few input bytes. To preserve malicious functionality, previous attacks either add bytes to existing non-functional areas of the file, potentially limiting their effectiveness, or require running computationally demanding validation steps to discard malware variants that do not correctly execute in sandbox environments. In this work, we overcome these limitations by developing a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks based on practical, functionality-preserving manipulations to the Windows Portable Executable (PE) file format. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section. Our experimental results show that these attacks outperform existing ones in both white-box and black-box scenarios, achieving a better trade-off between evasion rate and size of the injected payload, while also enabling evasion of models that have been shown to be robust to previous attacks. To facilitate reproducibility of our findings, we open-source our framework and all the corresponding attack implementations as part of the secml-malware Python library. We conclude this work by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on embedding domain knowledge from subject-matter experts directly into the learning process.
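To make the DOS-header manipulation concrete, the following is a minimal sketch of how the byte positions available to a Full DOS-style payload could be located. It is not the secml-malware implementation; the function names `full_dos_editable_offsets` and `inject_payload` are hypothetical. It assumes the standard PE layout, in which the `MZ` magic occupies offsets 0-1 and the 4-byte `e_lfanew` field at offset 0x3C points to the PE signature, and it assumes (as the attack description implies) that the remaining bytes before the PE header can be overwritten without affecting execution.

```python
import struct
from typing import List

MZ_MAGIC = b"MZ"
PE_SIGNATURE = b"PE\x00\x00"
E_LFANEW_OFFSET = 0x3C  # 4-byte little-endian pointer to the PE signature


def full_dos_editable_offsets(pe_bytes: bytes) -> List[int]:
    """Hypothetical helper: offsets before the PE header assumed safe to edit."""
    if len(pe_bytes) < E_LFANEW_OFFSET + 4 or pe_bytes[:2] != MZ_MAGIC:
        raise ValueError("input does not start with a valid DOS header")
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, E_LFANEW_OFFSET)
    if pe_bytes[pe_offset:pe_offset + 4] != PE_SIGNATURE:
        raise ValueError("e_lfanew does not point to a PE signature")
    # Keep the MZ magic and e_lfanew intact; everything else up to the
    # PE signature is treated as payload space (an assumption of this sketch).
    before_pointer = range(len(MZ_MAGIC), E_LFANEW_OFFSET)
    after_pointer = range(E_LFANEW_OFFSET + 4, pe_offset)
    return list(before_pointer) + list(after_pointer)


def inject_payload(pe_bytes: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: overwrite editable offsets with payload bytes."""
    patched = bytearray(pe_bytes)
    # zip stops at the shorter sequence, so an oversized payload is truncated.
    for offset, value in zip(full_dos_editable_offsets(pe_bytes), payload):
        patched[offset] = value
    return bytes(patched)
```

In an actual attack the payload bytes would be chosen by an optimizer against the target model rather than supplied directly. The Extend and Shift manipulations enlarge this space and additionally require updating `e_lfanew` or the section offsets, which this sketch does not attempt.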