The representational capacity of modern neural network architectures has made them a default choice in many applications with high-dimensional feature sets. However, these high-dimensional and potentially noisy features, combined with black-box models such as neural networks, negatively affect the interpretability, generalizability, and training time of these models. Here, I propose two integrated approaches to feature selection that can be incorporated directly into parameter learning. The first adds a drop-in layer and performs sequential weight pruning; the second is a sensitivity-based approach. I benchmarked both methods against Permutation Feature Importance (PFI), a general-purpose feature ranking method, and against a random baseline. The proposed approaches prove to be viable methods for feature selection, consistently outperforming the baselines on the tested datasets: MNIST, ISOLET, and HAR. They can be added to any existing model with only a few lines of code.
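As a rough illustration of how a drop-in feature-selection layer and a sensitivity-based ranking might be wired into an existing model, here is a minimal PyTorch sketch. The class name FeatureGate, the magnitude-based pruning rule, and the gradient-based sensitivity_scores helper are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class FeatureGate(nn.Module):
    """Hypothetical drop-in feature-selection layer: one learnable gate per
    input feature, applied element-wise before the rest of the network.
    Gates with the smallest magnitude can be pruned between training rounds."""

    def __init__(self, num_features: int):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_features))
        self.register_buffer("mask", torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * (self.gate * self.mask)

    def prune(self, k: int) -> torch.Tensor:
        """Permanently zero the k still-active gates with the smallest |gate|."""
        scores = self.gate.detach().abs().clone()
        scores[self.mask == 0] = float("inf")   # skip gates already pruned
        _, idx = torch.topk(scores, k, largest=False)
        self.mask[idx] = 0.0
        return idx                              # indices of the pruned features


def sensitivity_scores(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Rank input features by the mean absolute gradient of the model output
    with respect to each input dimension (a simple sensitivity proxy)."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs().mean(dim=0)


# Example: prepend the gate to an existing classifier for MNIST-sized inputs.
model = nn.Sequential(
    FeatureGate(784), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)
)
```

In this sketch, training alternates with calls to `prune`, so the surviving gate indices form the selected feature subset; `sensitivity_scores` gives an alternative ranking without modifying the model.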