Network analysis and machine learning techniques have been widely applied to build malware detection systems. Though these systems attain impressive results, they are often $(i)$ not extensible, being monolithic and well tuned for the specific task they were designed for but very difficult to adapt and/or extend to other settings, and $(ii)$ not interpretable, being black boxes whose inner complexity makes it impossible to link a detection result to its root cause, which makes further analysis of threats a challenge. In this paper we present RADAR, an extensible and explainable system that exploits the popular TTP (Tactics, Techniques, and Procedures) ontology of adversary behaviour described in the industry-standard MITRE ATT\&CK framework in order to unequivocally identify and classify malicious behaviour using network traffic. We evaluate RADAR on a very large dataset comprising 2,286,907 malicious and benign samples, representing a total of 84,792,452 network flows. The experimental analysis confirms that the proposed methodology can be effectively exploited: RADAR's ability to detect malware is comparable to that of other state-of-the-art non-interpretable systems. To the best of our knowledge, RADAR is the first TTP-based system for malware detection that uses machine learning while being extensible and explainable.
Title: RADAR: A TTP-based Extensible, Explainable, and Effective System for Network Traffic Analysis and Malware Detection