Insider threats are a growing concern for organizations because of the damage their members can inflict by combining privileged access with domain knowledge. Detecting such threats is challenging, both because authorized personnel can easily carry out malicious actions and because the few malicious footprints are hidden in the large and diverse audit data that organizations produce. In this paper, we propose an unsupervised insider threat detection system based on audit data using Bayesian Gaussian Mixture Models. The proposed approach uses a user-based model to better capture user-specific behavior, and an automatic feature extraction system based on Word2Vec for ease of use in a real-life scenario. The solution distinguishes itself by requiring neither data balancing nor training exclusively on normal instances, and by the little domain knowledge needed to implement it. Results nonetheless indicate that the proposed method competes with state-of-the-art approaches, with a recall of 88%, an accuracy and true negative rate of 93%, and a false positive rate of 6.9%. For our experiments, we used the benchmark dataset CERT version 4.2.
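For readers less familiar with the reported metrics, the following minimal sketch shows how recall, true negative rate, false positive rate, and accuracy derive from a binary confusion matrix in which "positive" means malicious. The class `ConfusionCounts`, the function `detection_metrics`, and the counts are hypothetical illustrations and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary detector where 'positive' means malicious."""
    tp: int  # malicious instances flagged as malicious
    fn: int  # malicious instances that were missed
    tn: int  # normal instances left unflagged
    fp: int  # normal instances flagged as malicious


def detection_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the four rates named in the abstract from raw counts."""
    positives = c.tp + c.fn  # all truly malicious instances
    negatives = c.tn + c.fp  # all truly normal instances
    return {
        "recall": c.tp / positives,
        "true_negative_rate": c.tn / negatives,
        "false_positive_rate": c.fp / negatives,
        "accuracy": (c.tp + c.tn) / (positives + negatives),
    }


# Hypothetical counts for illustration only, NOT the paper's results.
example = ConfusionCounts(tp=45, fn=5, tn=900, fp=50)
metrics = detection_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the true negative rate and the false positive rate share the same denominator, they always sum to 1, so a false positive rate of 6.9% corresponds to a true negative rate of 93.1%, which is consistent with the rounded 93% reported above.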