Detecting Zero-Day intrusions has long been a goal of cybersecurity, and of intrusion detection in particular. Machine learning is widely regarded as a promising approach to the problem, and numerous models have been proposed, but a practical solution has yet to emerge, mainly because the available open datasets are out of date. In this paper, we closely examine the flow-based statistical data generated by CICFlowMeter, using six popular machine learning classification models for Zero-Day attack detection. The training dataset, CIC-AWS-2018, contains fourteen types of intrusions, while the testing datasets contain eight different types of attacks. The six classification models are evaluated and cross-validated on CIC-AWS-2018 in terms of false-positive rate, true-positive rate, and time overhead. The testing datasets, which comprise eight novel (or Zero-Day) real-life attacks and benign traffic flows collected in a real research production network, are then used to test the performance of the chosen decision tree classifier. The results are promising, with accuracy as high as 100% and reasonable time overhead. We argue that, with the statistical data collected by CICFlowMeter, simple machine learning models such as decision tree classification may be sufficient to detect Zero-Day attacks.
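To make the evaluation metrics concrete, the following is a minimal sketch of how a true-positive rate and false-positive rate could be computed for attack-versus-benign detection from per-flow labels. It is not the paper's evaluation code: the function name `detection_rates`, the `"Benign"` label string, and the toy labels are assumptions made for illustration. The sketch also assumes that any non-benign prediction on an attack flow counts as a detection, even when the predicted attack class differs from the true one.

```python
from typing import Sequence, Tuple


def detection_rates(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    benign: str = "Benign",  # assumed label for benign flows
) -> Tuple[float, float]:
    """Return (true-positive rate, false-positive rate) for attack-vs-benign detection."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        is_attack = truth != benign
        flagged = pred != benign
        if is_attack and flagged:
            tp += 1  # attack flow flagged as some attack
        elif is_attack:
            fn += 1  # attack flow missed
        elif flagged:
            fp += 1  # benign flow wrongly flagged
        else:
            tn += 1  # benign flow correctly passed
    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels invented for illustration; they are not results from the paper.
y_true = ["Benign", "Benign", "Benign", "Benign", "DoS", "DoS", "Bot", "Bot"]
y_pred = ["Benign", "Benign", "Benign", "DoS", "DoS", "Bot", "Benign", "Bot"]

tpr, fpr = detection_rates(y_true, y_pred)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

Collapsing the multi-class labels to attack versus benign is one reasonable choice for Zero-Day detection, where a novel attack cannot match any trained class name exactly; a per-class breakdown would need a full confusion matrix instead.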