MAB-Malware: A Reinforcement Learning Framework for Attacking Static Malware Classifiers

Modern commercial antivirus systems increasingly rely on machine learning (ML) to keep up with the rapid growth of new malware. However, ML models are well known to be vulnerable to adversarial examples (AEs). Previous work has shown that ML malware classifiers are fragile under white-box adversarial attacks. In practice, though, the ML models used in commercial antivirus (AV) products are usually not available to attackers and return only hard classification labels. It is therefore more practical to evaluate the robustness of ML models and real-world AVs in a pure black-box manner. We propose a black-box reinforcement learning (RL) based framework to generate AEs for PE malware classifiers and AV engines. The framework treats the adversarial attack as a multi-armed bandit problem, which seeks an optimal balance between exploiting successful patterns and exploring new varieties. Compared to other frameworks, our improvements lie in three points: 1) we limit the exploration space by modeling the generation process as a stateless process, which avoids combinatorial explosion; 2) because the payload plays a critical role in AE generation, we reuse successful payloads in the modeling; 3) we minimize the changes made to AE samples so that rewards are assigned correctly during RL learning, which also helps identify the root cause of evasions. As a result, our framework achieves much higher black-box evasion rates than other off-the-shelf frameworks. Results show an evasion rate of 74% to 97% against two state-of-the-art ML detectors and 32% to 48% against commercial AVs in a pure black-box setting. We also demonstrate that adversarial attacks transfer more readily among ML-based classifiers than between purely ML-based classifiers and commercial AVs.
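
The bandit formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a bandit algorithm, so Thompson sampling with Beta priors is assumed here, and the action names, `toy_detector`, and `attack` are hypothetical stand-ins. The sketch shows the three ideas in miniature: arms are chosen without tracking any state of the sample, the detector returns only a hard label, and after an evasion the action list is minimized so that only the actions that actually caused the evasion are rewarded.

```python
import random

# Hypothetical action names, for illustration only.
ACTIONS = ["overlay_append", "section_rename", "section_add", "break_checksum"]


class BetaBandit:
    """Stateless bandit (assumed Thompson sampling): one Beta(alpha, beta) per arm."""

    def __init__(self, arms, seed=0):
        self.rng = random.Random(seed)
        self.stats = {arm: [1, 1] for arm in arms}  # [alpha, beta], uniform prior

    def select(self):
        # Draw one sample per arm and play the arm with the highest draw.
        draws = {arm: self.rng.betavariate(a, b) for arm, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, arm, evaded):
        # A success increments alpha, a failure increments beta.
        self.stats[arm][0 if evaded else 1] += 1


def toy_detector(applied_actions):
    """Toy black box: returns only a hard label (True means detected)."""
    return "overlay_append" not in applied_actions


def attack(bandit, detector, max_pulls=50):
    """Apply actions until the detector is evaded, then minimize the action list."""
    applied = []
    for _ in range(max_pulls):
        arm = bandit.select()
        applied.append(arm)
        if detector(applied):
            bandit.update(arm, False)  # still detected: penalize this pull
            continue
        # Evaded: drop every action whose removal keeps the sample evasive.
        essential = list(applied)
        for action in list(applied):
            trial = list(essential)
            trial.remove(action)
            if not detector(trial):
                essential = trial
        # Reward only the actions that were needed for the evasion.
        for action in set(essential):
            bandit.update(action, True)
        return essential
    return None


if __name__ == "__main__":
    print(attack(BetaBandit(ACTIONS), toy_detector))
```

Rewarding only the minimized action set is what keeps the reward signal clean: actions that happened to be applied before the evasion, but did not contribute to it, gain no credit.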

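Related content

Black box

In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics) without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyze something modeled as an open system with the typical "black-box approach", only the stimulus/response behavior is considered, in order to infer the (unknown) contents of the box. The usual representation of such a black-box system is a data flow diagram centered on the box. The opposite of a black box is a system whose internal components or logic are available for inspection, commonly called a white box (sometimes also a "clear box" or "glass box").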
