Knowledge about attacks contained in Cyber Threat Intelligence (CTI) reports is essential for identifying cyber threats effectively and responding to them quickly. However, this knowledge is often buried in large amounts of text and is therefore difficult to use. To address this challenge, we propose a novel approach and tool called EXTRACTOR that enables precise, automatic extraction of concise attack behaviors from CTI reports. EXTRACTOR makes no strong assumptions about the text and can extract attack behaviors as provenance graphs from unstructured text. We evaluate EXTRACTOR using real-world incident reports from various sources as well as reports of DARPA adversarial engagements that involve several attack campaigns on Windows, Linux, and FreeBSD. Our evaluation results show that EXTRACTOR can extract concise provenance graphs from CTI reports and that these graphs can be used successfully by cyber-analytics tools in threat hunting.