The rapid growth in both the scale and complexity of Android malware has driven the widespread adoption of machine learning (ML) techniques for scalable and accurate malware detection. Despite their effectiveness, these models remain vulnerable to adversarial attacks that introduce carefully crafted feature-level perturbations to evade detection while preserving malicious functionality. In this paper, we present LAMLAD, a novel adversarial attack framework that exploits the generative and reasoning capabilities of large language models (LLMs) to bypass ML-based Android malware classifiers. LAMLAD employs a dual-agent architecture composed of an LLM manipulator, which generates realistic and functionality-preserving feature perturbations, and an LLM analyzer, which guides the perturbation process toward successful evasion. To improve efficiency and contextual awareness, LAMLAD integrates retrieval-augmented generation (RAG) into the LLM pipeline. Focusing on Drebin-style feature representations, LAMLAD enables stealthy and high-confidence attacks against widely deployed Android malware detection systems. We evaluate LAMLAD against three representative ML-based Android malware detectors and compare its performance with two state-of-the-art adversarial attack methods. Experimental results demonstrate that LAMLAD achieves an attack success rate (ASR) of up to 97%, requiring on average only three attempts per adversarial sample, highlighting its effectiveness, efficiency, and adaptability in practical adversarial settings. Furthermore, we propose an adversarial training-based defense strategy that reduces the ASR by more than 30% on average, significantly enhancing model robustness against LAMLAD-style attacks.
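To make the dual-agent loop described in the abstract concrete, the following is a minimal, hypothetical sketch in Python. It is not the authors' implementation: the LLM manipulator, LLM analyzer, and RAG retrieval step are replaced by placeholder functions (`query_manipulator`, `query_analyzer`, `retrieve_context`), and the detector is a toy linear scorer over a Drebin-style binary feature vector. The sketch only adds features, mimicking additive, functionality-preserving perturbations; all names and the attempt budget are assumptions for illustration.

```python
import numpy as np


class MalwareClassifier:
    """Toy stand-in for an ML-based Android malware detector over Drebin features."""

    def __init__(self, weights: np.ndarray, bias: float):
        self.weights = weights
        self.bias = bias

    def predict_malicious(self, x: np.ndarray) -> bool:
        # Linear score as a simple placeholder for an arbitrary trained detector.
        return float(x @ self.weights + self.bias) > 0.0


def retrieve_context(x: np.ndarray) -> str:
    # Placeholder for the RAG step: fetch reference material (e.g., statistics of
    # benign apps with similar features) to ground the manipulator's proposals.
    return "retrieved notes on benign apps with similar Drebin features"


def query_manipulator(x: np.ndarray, context: str, hint: str) -> np.ndarray:
    # Placeholder for the LLM manipulator: propose a functionality-preserving
    # perturbation. Here we only set an absent feature to 1 (e.g., adding a
    # benign permission or API reference), which cannot remove malicious code.
    x_new = x.copy()
    zero_idx = np.where(x_new == 0)[0]
    if zero_idx.size > 0:
        x_new[np.random.choice(zero_idx)] = 1
    return x_new


def query_analyzer(x_old: np.ndarray, x_new: np.ndarray, detected: bool) -> str:
    # Placeholder for the LLM analyzer: summarize why detection persisted and
    # suggest the next perturbation direction for the manipulator.
    if detected:
        return "still detected; try adding more benign-associated features"
    return "evasion succeeded"


def lamlad_style_attack(x: np.ndarray, clf: MalwareClassifier, max_attempts: int = 10):
    """Iterate manipulator -> detector -> analyzer until evasion or budget exhaustion."""
    hint = ""
    for attempt in range(1, max_attempts + 1):
        context = retrieve_context(x)
        x_candidate = query_manipulator(x, context, hint)
        if not clf.predict_malicious(x_candidate):
            return x_candidate, attempt  # evasion succeeded
        hint = query_analyzer(x, x_candidate, detected=True)
        x = x_candidate
    return None, max_attempts  # attack failed within the attempt budget


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clf = MalwareClassifier(weights=rng.normal(size=64), bias=-0.5)
    x0 = (rng.random(64) > 0.5).astype(int)  # toy Drebin-style feature vector
    adversarial_x, attempts_used = lamlad_style_attack(x0, clf)
    print("evaded" if adversarial_x is not None else "failed", "after", attempts_used, "attempts")
```

In the actual framework the two placeholder queries would be prompts to the LLM agents, with the analyzer's feedback fed back into the manipulator's next prompt; the toy classifier stands in for the three evaluated detectors.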