Intrusion detection is a key topic in cyber security, and today's common network threats are diverse and constantly changing. Intrusion detection datasets are severely imbalanced, which lowers classification performance on attack classes with few samples and makes it difficult to detect network attacks accurately and efficiently. To address this, this paper proposes balancing the dataset with the ADASYN oversampling method and training the intrusion detection classifier with the random forest algorithm. Comparative experiments on the CICIDS 2017 dataset show that ADASYN combined with random forest performs better than the compared methods. Based on the experimental results, the improvements in precision, recall, and F1 score after applying ADASYN are analyzed. The experiments show that the proposed method can be applied to intrusion detection on large-scale data and effectively improves the classification accuracy for network attack behaviors. Compared with traditional machine learning models, it achieves better performance, generalization ability, and robustness.
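
To illustrate the oversampling step named in the abstract, the following is a minimal standard-library sketch of ADASYN for a two-class problem. It is not the authors' implementation: the function names `adasyn` and `_nearest`, the parameters `k`, `beta`, and `seed`, and the choice to interpolate towards minority-class neighbours are assumptions made for this example. The idea it shows is that minority samples surrounded by more majority-class neighbours receive a larger share of the synthetic samples.

```python
import math
import random


def _nearest(point, pool, k, skip=None):
    """Indices of the k points in `pool` closest to `point` (Euclidean)."""
    order = sorted(
        (i for i in range(len(pool)) if i != skip),
        key=lambda i: math.dist(point, pool[i]),
    )
    return order[:k]


def adasyn(minority, majority, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples (hypothetical helper, two classes only)."""
    rng = random.Random(seed)
    # Number of synthetic samples needed; beta=1.0 aims for full balance.
    total = (len(majority) - len(minority)) * beta
    if total <= 0:
        return []
    everything = minority + majority  # minority points occupy the first indices
    n_min = len(minority)

    # r_i: share of majority points among the k neighbours of minority point i.
    ratios = []
    for i, x in enumerate(minority):
        neighbours = _nearest(x, everything, k, skip=i)
        ratios.append(sum(1 for j in neighbours if j >= n_min) / k)
    norm = sum(ratios)
    if norm == 0:  # no minority point has a majority-class neighbour
        return []

    synthetic = []
    for i, x in enumerate(minority):
        # Harder-to-learn points (larger r_i) get more synthetic samples.
        count = round(ratios[i] / norm * total)
        mates = _nearest(x, minority, k, skip=i)
        if not mates:
            continue
        for _ in range(count):
            z = minority[rng.choice(mates)]
            lam = rng.random()
            # Interpolate between x and a randomly chosen minority neighbour.
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic
```

In a pipeline like the one the abstract describes, the returned samples would be appended to the minority class of the training split only, and a classifier such as a random forest would then be trained on the balanced data. Precision, recall, and F1 should still be measured on an untouched test split.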