Feature selection plays a vital role in improving classifier performance. However, current methods do not effectively distinguish the complex interactions among the selected features. To further remove these hidden negative interactions, we propose a GA-like dynamic probability (GADP) method with mutual information, which has a two-layer structure. The first layer applies mutual information to obtain a primary feature subset. The second layer, the GA-like dynamic probability algorithm, mines more supportive features from these candidate features. The GA-like method is a population-based algorithm, so its working mechanism is similar to that of the genetic algorithm (GA). Unlike popular works, which frequently focus on improving GA's operators to enhance search ability and reduce convergence time, we abandon GA's operators altogether and employ a dynamic probability, which relies on the performance of each chromosome, to determine feature selection in the new generation. The dynamic probability mechanism significantly reduces the number of parameters in GA, making it easy to use. As each gene's probability is independent, chromosome variety in GADP is greater than in traditional GA, which gives GADP a wider search space and lets it select relevant features more effectively and accurately. To verify our method, we evaluate it under multiple conditions on 15 datasets. The results show that the proposed method outperforms the alternatives; in general, it achieves the best accuracy. We also compare the proposed model with popular heuristic methods such as POS, FPA, and WOA, and it retains an advantage over them.
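The two-layer structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the probability update rule, so the fitness-weighted gene frequency used here is an assumption, as are all names (`mutual_information`, `mi_filter`, `gadp_search`), the default population size and generation count, and the clamping bounds. The `fitness` callable is a placeholder for whatever classifier accuracy is measured on a feature subset.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mi_filter(X, y, k):
    """Layer 1: keep the k feature indices with the highest MI with the label."""
    n_features = len(X[0])
    scores = [
        mutual_information([row[j] for row in X], y) for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def gadp_search(candidates, fitness, pop_size=20, generations=30, rng=None):
    """Layer 2: sample chromosomes from independent per-gene probabilities.

    No crossover or mutation operators are used. The update rule below
    (fitness-weighted gene frequency, clamped to [0.05, 0.95]) is an
    assumption made for illustration only.
    """
    rng = rng or random.Random(0)
    probs = [0.5] * len(candidates)  # every gene starts equally likely
    best_mask, best_fit = None, -math.inf
    for _ in range(generations):
        # Each gene is drawn independently of the others.
        pop = [[rng.random() < p for p in probs] for _ in range(pop_size)]
        fits = [
            fitness([c for c, keep in zip(candidates, mask) if keep])
            for mask in pop
        ]
        for mask, f in zip(pop, fits):
            if f > best_fit:
                best_mask, best_fit = mask, f
        # Shift fitness values so all weights are positive, then set each
        # gene's probability to its weighted frequency in the population.
        lowest = min(fits)
        weights = [f - lowest + 1e-9 for f in fits]
        total = sum(weights)
        for g in range(len(probs)):
            p = sum(w for w, mask in zip(weights, pop) if mask[g]) / total
            probs[g] = min(0.95, max(0.05, p))
    selected = [c for c, keep in zip(candidates, best_mask) if keep]
    return selected, best_fit
```

In use, `mi_filter(X, y, k)` would supply `candidates`, and `fitness` would wrap a cross-validated classifier; the only tunable values left in the second layer are the population size and the number of generations, which is in keeping with the abstract's claim of a reduced parameter count.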