Despite the increased adoption of open-source cyber threat intelligence (OSCTI) for acquiring knowledge about cyber threats, little effort has been made to harvest knowledge from the large number of unstructured OSCTI reports available in the wild (e.g., security articles, threat reports). These reports provide comprehensive threat knowledge across a variety of entities (e.g., IOCs, threat actors, TTPs) and relations, but this knowledge is hard to gather because of diverse report formats, large report quantities, and the complex structures and nuances of natural-language report text. To bridge this gap, we propose ThreatKG, a system for automated open-source cyber threat knowledge gathering and management. ThreatKG automatically collects a large number of OSCTI reports from various sources, extracts high-fidelity threat knowledge, constructs a threat knowledge graph, and updates the knowledge graph by continuously ingesting new knowledge. To address these challenges, ThreatKG provides: (1) a hierarchical ontology for modeling a variety of threat knowledge entities and relations; (2) an accurate deep learning-based pipeline for threat knowledge extraction; and (3) a scalable and extensible system architecture for threat knowledge graph construction, persistence, updating, and exploration. Evaluations on a large number of reports demonstrate the effectiveness of ThreatKG in threat knowledge gathering and management.
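To make the idea of a hierarchical ontology feeding a continuously updated knowledge graph more concrete, the following is a minimal sketch in Python. It is not ThreatKG's implementation: the type names in `ONTOLOGY_PARENT`, the `ThreatGraph` class, and its methods are all hypothetical and chosen only for illustration. The sketch assumes that an upstream extraction step already yields typed (head, relation, tail) triples, and it shows only how such triples might be validated against a type hierarchy and merged across reports.

```python
from collections import defaultdict

# Hypothetical hierarchy (child type -> parent type); these names are
# illustrative assumptions, not the ontology defined by ThreatKG.
ONTOLOGY_PARENT = {
    "IP": "IOC",
    "Domain": "IOC",
    "FileHash": "IOC",
    "IOC": "Entity",
    "ThreatActor": "Entity",
    "Technique": "Entity",
}


def is_a(entity_type, ancestor):
    """Return True if entity_type equals ancestor or descends from it."""
    current = entity_type
    while current is not None:
        if current == ancestor:
            return True
        current = ONTOLOGY_PARENT.get(current)
    return False


class ThreatGraph:
    """Toy in-memory knowledge graph keyed by (head, relation, tail)."""

    def __init__(self):
        self.node_types = {}            # entity name -> ontology type
        self.edges = defaultdict(set)   # triple -> ids of supporting reports

    def add_triple(self, head, head_type, relation, tail, tail_type, report_id):
        """Ingest one extracted triple, merging it with any existing copy."""
        for name, etype in ((head, head_type), (tail, tail_type)):
            if not is_a(etype, "Entity"):
                raise ValueError(f"unknown entity type: {etype}")
            # Keep the first type seen for a given entity name.
            self.node_types.setdefault(name, etype)
        # Repeated triples only add provenance, so the graph does not grow.
        self.edges[(head, relation, tail)].add(report_id)

    def neighbors(self, name, type_filter="Entity"):
        """List tails linked from name whose type falls under type_filter."""
        found = {
            tail
            for (head, _relation, tail) in self.edges
            if head == name and is_a(self.node_types[tail], type_filter)
        }
        return sorted(found)
```

With this sketch, ingesting the same triple from two different reports leaves a single edge that records both report identifiers, and a call such as `neighbors("some-actor", "IOC")` uses the hierarchy to match any IOC subtype (for example `IP` or `Domain`) without listing each subtype explicitly.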