The effectiveness of machine learning models depends heavily on dataset size and feature quality, because redundant and irrelevant features can severely degrade performance. This paper proposes IGRF-RFE, a hybrid feature selection method for multi-class network anomaly detection using a multilayer perceptron (MLP) network. IGRF-RFE is a feature reduction technique that combines a filter feature selection method with a wrapper feature selection method. The filter step, which combines Information Gain and Random Forest Importance, reduces the feature subset search space. We then apply recursive feature elimination (RFE) as the wrapper step to recursively eliminate redundant features from the reduced feature subset. Experimental results on the UNSW-NB15 dataset confirm that the proposed method improves anomaly detection accuracy while reducing the feature dimension: the feature dimension drops from 42 to 23, while the multi-class classification accuracy of the MLP rises from 82.25% to 84.24%.
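The two-stage idea described in the abstract (a filter step followed by an RFE wrapper step) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic rather than UNSW-NB15, Information Gain is approximated with `mutual_info_classif`, the two scores are combined by a simple normalised sum, the intermediate subset size of 30 is a hypothetical value, and RFE ranks features with a random forest because scikit-learn's `RFE` needs an estimator that exposes feature importances, which `MLPClassifier` does not. The helper names `igrf_filter` and `rfe_wrapper` are invented for this sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def igrf_filter(X, y, keep, seed=0):
    """Filter step: rank features by a combined IG + RF importance score."""
    # Mutual information stands in for Information Gain (assumption).
    ig = mutual_info_classif(X, y, random_state=seed)
    rf = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
    rfi = rf.feature_importances_
    # Assumed combination rule: normalise each score vector, then add.
    ig_norm = ig / ig.sum() if ig.sum() > 0 else ig
    rfi_norm = rfi / rfi.sum() if rfi.sum() > 0 else rfi
    combined = ig_norm + rfi_norm
    # Indices of the `keep` highest-scoring features, in ascending order.
    return np.sort(np.argsort(combined)[::-1][:keep])


def rfe_wrapper(X, y, n_features, seed=0):
    """Wrapper step: recursively drop the weakest feature until n_features remain."""
    estimator = RandomForestClassifier(n_estimators=50, random_state=seed)
    selector = RFE(estimator, n_features_to_select=n_features, step=1)
    selector.fit(X, y)
    return np.flatnonzero(selector.support_)


# Synthetic multi-class data with 42 features (stand-in for UNSW-NB15).
X, y = make_classification(
    n_samples=400, n_features=42, n_informative=10, n_redundant=10,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: the filter narrows the search space (30 is a hypothetical size).
filtered = igrf_filter(X_train, y_train, keep=30)
# Stage 2: RFE runs only on the filtered columns, down to 23 features.
local = rfe_wrapper(X_train[:, filtered], y_train, n_features=23)
selected = filtered[local]  # map back to original column indices

# Train an MLP on the selected features only.
scaler = StandardScaler().fit(X_train[:, selected])
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
mlp.fit(scaler.transform(X_train[:, selected]), y_train)
accuracy = mlp.score(scaler.transform(X_test[:, selected]), y_test)
print(f"{len(selected)} features kept, MLP test accuracy: {accuracy:.3f}")
```

Running the filter first means RFE refits its estimator over fewer candidate columns, which is the search-space reduction the abstract refers to. The accuracy printed here depends on the synthetic data and says nothing about the reported UNSW-NB15 results.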