Mutual information (MI) based feature selection (MIBFS) uses MI to score each feature and shortlist a relevant feature subset, addressing the issues associated with high-dimensional datasets. Despite the effectiveness of MI in feature selection, we observe that many state-of-the-art algorithms disregard the so-called unique relevance (UR) of features and therefore arrive at a suboptimal feature subset containing a non-negligible number of redundant features. We point out that the heart of the problem is that all these MIBFS algorithms follow the criterion of Maximize Relevance with Minimum Redundancy (MRwMR), which does not explicitly target UR. This motivates us to augment the existing criterion with the objective of boosting unique relevance (BUR), leading to a new criterion called MRwMR-BUR. Depending on the task being addressed, MRwMR-BUR has two variants, termed MRwMR-BUR-KSG and MRwMR-BUR-CLF, which estimate UR differently. MRwMR-BUR-KSG estimates UR with the KSG estimator, a nearest-neighbor based approach, and is designed for three major tasks: (i) classification performance, (ii) feature interpretability, and (iii) classifier generalization. MRwMR-BUR-CLF estimates UR with a classifier based approach; it adapts UR to different classifiers, further improving the competitiveness of MRwMR-BUR for tasks oriented toward classification performance. Both variants are validated in experiments using six public datasets and three popular classifiers. Specifically, compared to MRwMR, MRwMR-BUR-KSG improves test accuracy by 2% to 3% while selecting 25% to 30% fewer features, without increasing the algorithm complexity. MRwMR-BUR-CLF further improves classification performance by 3.8% to 5.5% (relative to MRwMR) and also outperforms three popular classifier-dependent feature selection methods.
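To make the idea of augmenting MRwMR with a UR term concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give formulas, so several things here are assumptions: UR of a feature is taken to be the MI between all features and the label minus the MI obtained after dropping that feature; the base MRwMR score is taken to be relevance minus mean pairwise redundancy; and the two are mixed with a hypothetical weight `beta`. A plug-in estimator for discrete data stands in for the KSG estimator, and all function names are illustrative.

```python
from collections import Counter
from math import log


def mutual_info(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint_counts = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)
    return sum(
        c / n * log(c * n / (x_counts[x] * y_counts[y]))
        for (x, y), c in joint_counts.items()
    )


def joint(columns):
    """Merge several feature columns into a single column of tuples."""
    return list(zip(*columns))


def unique_relevance(features, y, k):
    """ASSUMED definition: UR_k = I(all features; Y) - I(all features except k; Y)."""
    rest = [col for j, col in enumerate(features) if j != k]
    without_k = mutual_info(joint(rest), y) if rest else 0.0
    return mutual_info(joint(features), y) - without_k


def select_features(features, y, n_select, beta=0.1):
    """Greedy selection; beta=0 falls back to a plain relevance-minus-redundancy score."""
    ur = [unique_relevance(features, y, k) for k in range(len(features))]
    selected = []
    while len(selected) < min(n_select, len(features)):
        best, best_score = None, float("-inf")
        for k, col in enumerate(features):
            if k in selected:
                continue
            relevance = mutual_info(col, y)
            redundancy = (
                sum(mutual_info(col, features[j]) for j in selected) / len(selected)
                if selected else 0.0
            )
            # ASSUMED combination rule: convex mix of the base score and UR.
            score = (1 - beta) * (relevance - redundancy) + beta * ur[k]
            if score > best_score:
                best, best_score = k, score
        selected.append(best)
    return selected


# Toy data: f0 and f1 are duplicates; f2 alone carries the second bit of y.
features = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
y = [0, 1, 2, 3]
print([round(unique_relevance(features, y, k), 3) for k in range(3)])
print(select_features(features, y, n_select=1, beta=0.5))
```

In this toy example the duplicated features f0 and f1 have zero UR, because dropping either one loses no information about the label, while f2 has UR equal to log 2 (about 0.693 nats). With `beta=0.5` the UR term breaks the three-way tie in plain relevance and f2 is selected first. The plug-in estimator is only sensible for small discrete problems; for continuous features a nearest-neighbor estimator such as KSG would be needed instead.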