Single-Shot Black-Box Adversarial Attacks Against Malware Detectors: A Causal Language Model Approach

Deep Learning (DL)-based malware detectors are increasingly adopted for early detection of malicious behavior in cybersecurity. However, their sensitivity to adversarial malware variants has raised serious security concerns. Defenders who generate such adversarial variants themselves can use them to improve the resistance of DL-based malware detectors. This need has given rise to an emerging stream of machine learning research, Adversarial Malware example Generation (AMG), which aims to generate evasive adversarial malware variants that preserve the malicious functionality of a given malware. Within AMG research, black-box methods have gained more attention than white-box methods. However, most black-box AMG methods require numerous interactions with the malware detector to generate adversarial malware examples. Given that most malware detectors enforce a query limit, such methods can produce unrealistic adversarial examples that are likely to be detected in practice due to a lack of stealth. In this study, we show that a novel DL-based causal language model enables single-shot evasion (i.e., with only one query to the malware detector) by treating the content of the malware executable as a byte sequence and training a Generative Pre-Trained Transformer (GPT). Our proposed method, MalGPT, significantly outperformed the leading benchmark methods on a real-world malware dataset obtained from VirusTotal, achieving an evasion rate of over 24.51%. MalGPT enables cybersecurity researchers to develop advanced defense capabilities by emulating large-scale realistic AMG.
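
To make the two ideas in the abstract concrete, the following is a minimal sketch, not MalGPT's actual implementation. It shows how an executable's content can be viewed as a sequence of byte tokens with (context, next byte) pairs of the kind a causal language model is trained on, and how a single-shot evasion rate can be measured when each variant is submitted to the detector exactly once. The function names, the context length, and the toy detector are all assumptions introduced for illustration; the abstract does not specify them.

```python
from typing import Callable, List, Sequence, Tuple


def bytes_to_tokens(data: bytes) -> List[int]:
    """Treat raw file content as a token sequence: one token per byte (0-255)."""
    return list(data)


def next_byte_pairs(
    tokens: Sequence[int], context_len: int
) -> List[Tuple[List[int], int]]:
    """Build (context, next_token) pairs, the training signal of a causal LM.

    context_len is a hypothetical window size, not a value from the paper.
    """
    pairs = []
    for i in range(1, len(tokens)):
        start = max(0, i - context_len)
        pairs.append((list(tokens[start:i]), tokens[i]))
    return pairs


def single_shot_evasion_rate(
    variants: Sequence[bytes], detector: Callable[[bytes], bool]
) -> float:
    """Fraction of variants the detector misses, with one query per variant.

    detector returns True when it flags the sample as malicious.
    """
    if not variants:
        return 0.0
    evaded = sum(1 for v in variants if not detector(v))
    return evaded / len(variants)


# Toy stand-in for a black-box detector (assumption, for illustration only).
def toy_detector(sample: bytes) -> bool:
    return b"bad" in sample


tokens = bytes_to_tokens(b"MZ\x90\x00")
pairs = next_byte_pairs(tokens, context_len=2)
rate = single_shot_evasion_rate([b"bad1", b"ok", b"bad2", b"fine"], toy_detector)
```

In this sketch the detector is called once per variant and never consulted during generation, which is what distinguishes a single-shot setting from query-heavy black-box methods.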
