The representational capacity of modern neural network architectures has made them a default choice in many applications with high-dimensional feature sets. However, these high-dimensional and potentially noisy features, combined with black-box models such as neural networks, negatively affect the interpretability, generalizability, and training time of these models. Here, I propose two integrated approaches to feature selection that can be incorporated directly into parameter learning. The first adds a drop-in layer and performs sequential weight pruning; the second is a sensitivity-based approach. I benchmarked both methods against Permutation Feature Importance (PFI), a general-purpose feature ranking method, and against a random baseline. The proposed approaches prove to be viable methods for feature selection, consistently outperforming the baselines on the tested datasets: MNIST, ISOLET, and HAR. They can be added to any existing model with only a few lines of code.
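As a rough illustration of how a drop-in feature-selection layer and a sensitivity-based ranking might be wired into an existing model, here is a minimal PyTorch sketch. The class name FeatureGate, the magnitude-based pruning rule, and the gradient-based sensitivity_scores helper are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class FeatureGate(nn.Module):
    """Hypothetical drop-in feature-selection layer: one learnable gate per
    input feature, applied element-wise before the rest of the network.
    Gates with the smallest magnitude can be pruned between training rounds."""

    def __init__(self, num_features: int):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_features))
        self.register_buffer("mask", torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * (self.gate * self.mask)

    def prune(self, k: int) -> torch.Tensor:
        """Permanently zero the k still-active gates with the smallest |gate|."""
        scores = self.gate.detach().abs().clone()
        scores[self.mask == 0] = float("inf")   # skip gates already pruned
        _, idx = torch.topk(scores, k, largest=False)
        self.mask[idx] = 0.0
        return idx                              # indices of the pruned features


def sensitivity_scores(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Rank input features by the mean absolute gradient of the model output
    with respect to each input dimension (a simple sensitivity proxy)."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs().mean(dim=0)


# Example: prepend the gate to an existing classifier for MNIST-sized inputs.
model = nn.Sequential(
    FeatureGate(784), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)
)
```

In this sketch, training alternates with calls to `prune`, so the surviving gate indices form the selected feature subset; `sensitivity_scores` gives an alternative ranking without modifying the model.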