Feature selection and attribute reduction are crucial problems and widely used techniques in machine learning, data mining, and pattern recognition for overcoming the well-known curse of dimensionality, either by selecting a subset of features or by removing irrelevant ones. This paper presents a new feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we employ the fuzzy-rough dependency degree (FRDD) used in lower approximation-based fuzzy-rough feature selection (L-FRFS), owing to its effectiveness in feature selection. For the search strategy, a new binary version of the shuffled frog leaping algorithm (B-SFLA) is proposed. The new feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Non-parametric statistical tests are conducted to compare the proposed approach with several existing methods on twenty-two datasets from the UCI repository, including nine high-dimensional and large ones. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of both the number of selected features and the classification accuracy.
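For reference, the fuzzy-rough dependency degree used as the evaluation measure in L-FRFS is commonly written as follows. This is the standard lower-approximation formulation from the fuzzy-rough set literature; the exact similarity relation and implicator used in the paper may differ.

$$\mu_{\underline{R_P}X}(x) = \inf_{y \in \mathbb{U}} I\!\left(\mu_{R_P}(x,y),\, \mu_X(y)\right), \qquad
\mu_{POS_{R_P}(Q)}(x) = \sup_{X \in \mathbb{U}/Q} \mu_{\underline{R_P}X}(x),$$

$$\gamma'_P(Q) = \frac{\sum_{x \in \mathbb{U}} \mu_{POS_{R_P}(Q)}(x)}{|\mathbb{U}|},$$

where $\mathbb{U}$ is the universe of objects, $R_P$ is the fuzzy similarity relation induced by the attribute subset $P$, $I$ is a fuzzy implicator, and $Q$ is the set of decision attributes. A subset $P$ with $\gamma'_P(Q)$ close to that of the full attribute set preserves the discernibility of the data while using fewer features.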
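The abstract does not spell out the search mechanics, but a binary SFLA wrapper for feature-subset search typically proceeds as sketched below. The function name `binary_sfla`, the parameter defaults, and the sigmoid binarisation of the leap step are illustrative assumptions, not the authors' exact procedure; the `fitness` callable stands in for an FRDD-based subset score.

```python
import numpy as np

def binary_sfla(fitness, n_features, n_frogs=30, n_memeplexes=5,
                memeplex_iters=10, max_generations=50, seed=None):
    """Minimal sketch of a binary shuffled frog leaping algorithm.

    `fitness` maps a 0/1 mask of length `n_features` to a score to be
    maximised (e.g. a fuzzy-rough dependency degree penalised by subset size).
    """
    rng = np.random.default_rng(seed)
    frogs = rng.integers(0, 2, size=(n_frogs, n_features))
    scores = np.array([fitness(f) for f in frogs])

    def binarize(step):
        # Map a real-valued leap onto {0, 1} via a sigmoid threshold.
        prob = 1.0 / (1.0 + np.exp(-step))
        return (rng.random(n_features) < prob).astype(int)

    for _ in range(max_generations):
        order = np.argsort(scores)[::-1]              # best frogs first
        frogs, scores = frogs[order], scores[order]
        global_best = frogs[0].copy()
        for m in range(n_memeplexes):
            # Deal frogs into memeplexes in round-robin fashion.
            idx = np.arange(m, n_frogs, n_memeplexes)
            for _ in range(memeplex_iters):
                local = idx[np.argsort(scores[idx])[::-1]]
                best, worst = local[0], local[-1]
                # Leap the worst frog towards the memeplex best.
                candidate = binarize(rng.random(n_features) *
                                     (frogs[best] - frogs[worst]))
                if fitness(candidate) <= scores[worst]:
                    # No improvement: leap towards the global best instead.
                    candidate = binarize(rng.random(n_features) *
                                         (global_best - frogs[worst]))
                if fitness(candidate) <= scores[worst]:
                    # Still no improvement: replace with a random frog.
                    candidate = rng.integers(0, 2, n_features)
                frogs[worst], scores[worst] = candidate, fitness(candidate)

    best = int(np.argmax(scores))
    return frogs[best], scores[best]
```

In a hybrid of this kind, the selected feature mask is the one returned by the search, and the final classification accuracy is then measured by training a classifier on that reduced attribute set.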