Intrusion detection is a key topic in cyber security, and today's common network threats are both diverse and constantly changing. Intrusion detection datasets are severely imbalanced, which leads to poor classification performance on attack classes with few samples and makes it difficult to detect network attacks accurately and efficiently. To address this, this paper proposes using the Adaptive Synthetic Sampling (ADASYN) method to balance the dataset and the Random Forest algorithm to train the intrusion detection classifier. Comparative intrusion detection experiments on the CICIDS 2017 dataset show that Random Forest combined with ADASYN performs better. Based on the experimental results, the improvements in precision, recall, F1 score, and AUC after applying ADASYN are then analyzed. The experiments show that the proposed method can be applied to intrusion detection on large datasets and can effectively improve the classification accuracy for network attack behaviors. Compared with traditional machine learning models, it offers better performance, generalization ability, and robustness.
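To make the balancing step concrete, the following is a minimal sketch of the basic ADASYN idea in plain Python: minority samples whose neighbourhoods contain more majority-class points receive more synthetic samples, and each synthetic sample is an interpolation between a minority point and one of its nearest minority neighbours. The function name `adasyn`, its parameters, the toy data, and the uniform-weight fallback are assumptions made for illustration and are not taken from the paper. In practice one would more likely use a library implementation, for example `ADASYN` from imbalanced-learn followed by `RandomForestClassifier` from scikit-learn; whether the authors used these libraries is not stated.

```python
import math
import random


def adasyn(X, y, minority, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (illustrative ADASYN sketch)."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_min = len(minority_pts)
    n_maj = len(X) - n_min
    total = int((n_maj - n_min) * beta)  # synthetic samples needed overall
    if total <= 0 or n_min < 2:
        return []

    # r_i: share of majority-class points among the k nearest neighbours
    # of each minority point, searched over the full dataset.
    ratios = []
    for p in minority_pts:
        neighbours = sorted(
            ((math.dist(p, q), label) for q, label in zip(X, y) if q is not p),
            key=lambda t: t[0],
        )[:k]
        ratios.append(sum(label != minority for _, label in neighbours) / len(neighbours))

    norm = sum(ratios)
    if norm == 0:
        # Assumption: if no minority point borders the majority class,
        # fall back to uniform weights.
        weights = [1 / n_min] * n_min
    else:
        weights = [r / norm for r in ratios]

    synthetic = []
    for p, w in zip(minority_pts, weights):
        # Nearest minority-class neighbours of p (excluding p itself).
        near = sorted(
            (q for q in minority_pts if q is not p),
            key=lambda q: math.dist(p, q),
        )[:k]
        # Harder-to-learn points (larger w) get more synthetic samples.
        for _ in range(round(w * total)):
            q = rng.choice(near)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic


# Toy data: 8 "benign" flows and 3 "attack" flows with two features each.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2],
     [0.0, 0.4], [0.4, 0.0], [0.2, 0.4], [0.4, 0.3],
     [1.0, 1.0], [1.2, 1.1], [0.5, 0.4]]
y = ["benign"] * 8 + ["attack"] * 3
new_points = adasyn(X, y, minority="attack", k=3)
```

In this toy data the attack sample at `[0.5, 0.4]` sits next to the benign cluster, so it receives the largest share of the synthetic samples. The balanced set (original plus synthetic points) would then be passed to the classifier for training.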