Modern commercial antivirus (AV) systems increasingly rely on machine learning (ML) to keep up with the rapid growth of new malware. However, it is well known that ML models are vulnerable to adversarial examples (AEs). Previous work has shown that ML malware classifiers are vulnerable to white-box adversarial attacks. In practice, though, the ML models used in commercial antivirus products are usually not available to attackers and return only hard classification labels. It is therefore more practical to evaluate the robustness of ML models and real-world AVs in a pure black-box manner. We propose a black-box Reinforcement Learning (RL) based framework that generates AEs for PE malware classifiers and AV engines. The framework treats the adversarial attack problem as a multi-armed bandit problem, seeking an optimal balance between exploiting successful patterns and exploring new variations. Compared to other frameworks, our improvements lie in three points. 1) We limit the exploration space by modeling the generation process as a stateless process, which avoids combinatorial explosion. 2) Because the payload plays a critical role in AE generation, we design the model to reuse successful payloads. 3) We minimize the changes made to AE samples so that rewards are assigned correctly during RL learning; this also helps identify the root cause of evasions. As a result, our framework achieves much higher black-box evasion rates than other off-the-shelf frameworks. Results show evasion rates in the range of 74\%--97\% against two state-of-the-art ML detectors and 32\%--48\% against commercial AVs in a pure black-box setting. We also demonstrate that adversarial attacks transfer better among ML-based classifiers than between purely ML-based classifiers and commercial AVs.
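To make the stateless multi-armed bandit framing concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name a bandit algorithm, so Thompson sampling with Beta posteriors is assumed here purely for illustration. The action names, the `toy_detector` stand-in, and its evasion probabilities are all hypothetical. Each arm is one manipulation, and the only feedback is a hard label (evaded or detected), matching the black-box setting described above.

```python
import random


class BetaArm:
    """One bandit arm: a hypothetical manipulation with a Beta posterior
    over the probability that applying it evades the detector."""

    def __init__(self, name):
        self.name = name
        self.alpha = 1.0  # 1 + number of observed evasions
        self.beta = 1.0   # 1 + number of observed detections

    def sample(self, rng):
        # Draw a plausible evasion rate from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, evaded):
        # Hard-label feedback only: the detector says "evaded" or "detected".
        if evaded:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def run_bandit(arms, query_label, rounds, seed=0):
    """Thompson sampling (an assumed choice): pick the arm with the highest
    sampled evasion rate, query the black box, and update that arm.
    The choice is stateless: it does not depend on earlier manipulations."""
    rng = random.Random(seed)
    for _ in range(rounds):
        arm = max(arms, key=lambda a: a.sample(rng))
        arm.update(query_label(arm.name, rng))
    return arms


def toy_detector(action_name, rng):
    """Hypothetical stand-in for a black-box detector. Returns True when the
    manipulated sample evades detection. The probabilities are made up."""
    evasion_prob = {
        "append_overlay": 0.05,
        "rename_section": 0.10,
        "add_section": 0.60,
    }
    return rng.random() < evasion_prob[action_name]


if __name__ == "__main__":
    names = ("append_overlay", "rename_section", "add_section")
    arms = run_bandit([BetaArm(n) for n in names], toy_detector, rounds=500)
    for arm in arms:
        # Number of times each arm was pulled.
        print(arm.name, int(arm.alpha + arm.beta - 2))
```

In this toy setup the sampler shifts most of its queries toward the arm with the highest evasion probability while still occasionally trying the others, which is the exploit/explore balance the abstract refers to. A real attack would replace `toy_detector` with queries to an actual classifier or AV engine.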