Feature selection, which selects an informative subset of variables from data, not only enhances model interpretability and performance but also reduces resource demands. Recently, feature selection using neural networks has attracted growing attention. However, existing methods usually incur high computational costs when applied to high-dimensional datasets. In this paper, inspired by evolutionary processes, we propose a novel resource-efficient supervised feature selection method based on sparse neural networks, named \enquote{NeuroFS}. By gradually pruning uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS efficiently derives an informative subset of features. Through experiments on $11$ low- and high-dimensional real-world benchmarks of different types, we demonstrate that NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models. The code is available on GitHub.
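The gradual pruning of input features described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not NeuroFS itself: it assumes feature importance is scored by the summed absolute magnitude of a feature's outgoing input-layer weights (a common magnitude heuristic; the paper's actual criterion and training loop may differ), and it drops a fraction of the least important active features per round until the target subset size is reached.

```python
import numpy as np

def select_features(W, k, drop_fraction=0.3):
    """Iteratively drop low-importance input features until k remain.

    W: (n_features, n_hidden) input-layer weight matrix of a trained network.
    Importance of a feature = sum of absolute outgoing weights
    (an illustrative magnitude heuristic, not NeuroFS's exact criterion).
    Returns the indices of the k retained features.
    """
    active = list(range(W.shape[0]))
    while len(active) > k:
        # Score each still-active feature by its outgoing weight magnitude.
        importance = np.abs(W[active]).sum(axis=1)
        # Drop a fraction of the least important, but never overshoot k.
        n_drop = min(max(1, int(drop_fraction * len(active))), len(active) - k)
        order = np.argsort(importance)      # ascending: least important first
        keep = sorted(order[n_drop:])
        active = [active[i] for i in keep]
    return active
```

For example, with a toy weight matrix whose second and fourth features carry the largest weights, the sketch retains exactly those two indices.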