During the past two years, Flash malware has become one of the most insidious threats to detect, with almost 600 critical vulnerabilities targeting Adobe Flash Player disclosed in the wild. Research has shown that machine learning can be successfully used to tackle the increasing variability and sophistication of Flash malware, by simply leveraging static analysis to extract information from the structure of the file or from its bytecode. However, the robustness of such systems against well-crafted evasion attempts - also known as adversarial examples - has never been investigated. In this paper, we first discuss how to craft adversarial Flash malware examples, and show that it suffices to only slightly manipulate them to evade detection. We then empirically demonstrate that popular defense techniques proposed to mitigate this threat, including re-training on adversarial examples, may not always be effective. We argue that this occurs when the feature vectors extracted from adversarial examples become indistinguishable from those of benign data, meaning that the given feature representation is intrinsically vulnerable. In this respect, we are the first to formally define and quantitatively characterize this vulnerability, highlighting when an attack can be countered by solely improving the security of the learning algorithm, and when it also requires considering additional features. We conclude the paper by suggesting alternative research directions to improve the security of learning-based Flash malware detectors.