Cyber threat hunting is the proactive search for threats hidden in an organization's information systems. It is a crucial component of active defense against advanced persistent threats (APTs). However, most current threat hunting methods rely on Cyber Threat Intelligence (CTI), which can find known attacks but cannot find unknown attacks that CTI has not yet disclosed. In this paper, we propose LogKernel, a threat hunting method based on graph kernel clustering that effectively separates attack behaviour from benign activities. LogKernel first abstracts system audit logs into Behaviour Provenance Graphs (BPGs), and then clusters the graphs by embedding them into a continuous space using a graph kernel. In particular, we design a new graph kernel clustering method based on the characteristics of BPGs, which captures both the structural information and the rich label information of the BPGs. To reduce false positives, LogKernel further quantifies the threat of abnormal behaviour. We evaluate LogKernel on a malicious dataset that includes seven simulated attack scenarios and on the DARPA CADETS dataset, which includes four attack scenarios. The results show that LogKernel hunts all of these attack scenarios and that, in contrast to state-of-the-art methods, it can find unknown attacks.
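The pipeline described above (labelled provenance graphs, a graph kernel, then clustering) can be illustrated with a minimal sketch. This is not LogKernel's kernel or clustering procedure: it substitutes a standard Weisfeiler-Lehman subtree kernel extended with edge labels, and a simple greedy threshold clustering. The toy graphs, the node and edge label names, the function names, and the threshold of 0.5 are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def wl_features(nodes, edges, iterations=2):
    """Count Weisfeiler-Lehman subtree labels of a labelled directed graph.

    nodes: dict mapping node id -> node label (e.g. "process", "file")
    edges: list of (src, dst, edge_label) triples
    """
    labels = dict(nodes)
    neighbours = {n: [] for n in nodes}
    for src, dst, elabel in edges:
        neighbours[src].append((elabel, dst))
        # Record the reverse direction too, so targets also see their sources.
        neighbours[dst].append(("rev:" + elabel, src))

    features = Counter(labels.values())
    for h in range(1, iterations + 1):
        new_labels = {}
        for n in nodes:
            # New label = own label plus the sorted multiset of
            # (edge label, neighbour label) pairs.
            neigh = sorted((el, labels[m]) for el, m in neighbours[n])
            new_labels[n] = f"{h}:{labels[n]}|{neigh}"
        labels = new_labels
        features.update(labels.values())
    return features


def normalized_kernel(fa, fb):
    """Cosine-normalized dot product of two WL feature counters."""
    dot = sum(fa[k] * fb[k] for k in fa.keys() & fb.keys())
    norm = sqrt(sum(v * v for v in fa.values()) * sum(v * v for v in fb.values()))
    return dot / norm if norm else 0.0


def greedy_cluster(feature_list, threshold=0.5):
    """Assign each graph to the first cluster whose representative is similar enough."""
    clusters = []  # each cluster is a list of graph indices; index 0 is the representative
    for i, feats in enumerate(feature_list):
        for members in clusters:
            if normalized_kernel(feats, feature_list[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical toy graphs: two structurally identical "benign" behaviours
# and one behaviour with a different shape and different labels.
benign_a = ({"p": "process", "f1": "file", "f2": "file"},
            [("p", "f1", "read"), ("p", "f2", "write")])
benign_b = ({"q": "process", "g1": "file", "g2": "file"},
            [("q", "g1", "read"), ("q", "g2", "write")])
unusual = ({"p": "process", "s": "socket", "c": "process"},
           [("p", "s", "connect"), ("p", "c", "fork")])

features = [wl_features(n, e) for n, e in (benign_a, benign_b, unusual)]
clusters = greedy_cluster(features)
print(clusters)
```

In this toy setting the two benign graphs fall into one cluster and the third graph forms a cluster of its own. Treating small or isolated clusters as candidates for further threat scoring is one plausible reading of the abstract, not a confirmed detail of the method.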