Machine learning for anomaly detection has become a widely researched field in cybersecurity. Inherent to today's operating environment is the practice of adversarial machine learning, which attempts to circumvent machine learning models. In this work, we examine the feasibility of unsupervised learning and graph-based methods for anomaly detection in the network intrusion detection system setting, and we apply an ensemble approach to supervised learning for the same anomaly detection problem. We incorporate a realistic adversarial training mechanism when training our supervised models to enable strong classification performance in adversarial environments. Our results indicate that a two-level supervised stacking ensemble outperforms the unsupervised and graph-based methods at detecting anomalies (malicious activity). This model consists of three different classifiers in the first level, followed by either a Naive Bayes or a Decision Tree classifier in the second level. The model maintains an F1-score above 0.97 for malicious samples across all tested level-two classifiers. Notably, Naive Bayes is the fastest level-two classifier, averaging 1.12 seconds, while Decision Tree achieves the highest AUC score, 0.98.
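As a rough illustration of the two-level stacking structure described above, the following minimal sketch uses scikit-learn's `StackingClassifier`. It rests on several assumptions: the abstract does not name the three level-one classifiers, so the random forest, k-nearest neighbours, and logistic regression used here are placeholders; the data is synthetic rather than real network traffic; and the adversarial training mechanism is not reproduced. It is not the authors' implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for labelled network-flow features (assumption: 1 = malicious).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Level one: three different classifiers (placeholder choices, not from the paper).
level_one = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("lr", LogisticRegression(max_iter=1000)),
]

# Level two: Naive Bayes, trained on cross-validated level-one predictions.
# Swapping in sklearn.tree.DecisionTreeClassifier() gives the other variant.
stack = StackingClassifier(estimators=level_one, final_estimator=GaussianNB(), cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
malicious_f1 = f1_score(y_test, y_pred, pos_label=1)
print(f"F1 (malicious class): {malicious_f1:.3f}")
```

The `cv=5` setting means the level-two classifier is fitted on out-of-fold predictions from level one, which keeps it from simply memorising level-one outputs on the training set.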