We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem. The features are selected one by one sequentially, in an iterative loss minimization procedure. We use neural networks as predictors in the algorithm to compute the loss, and hence we refer to our method as neural greedy pursuit (NGP). NGP is efficient in selecting $N$ features when $N \ll P$, and it provides a notion of feature importance in descending order following the sequential selection procedure. We experimentally show that NGP performs better than several feature selection methods, such as DeepLIFT and drop-one-out loss. In addition, we experimentally show a phase transition behavior in which perfect selection of all $N$ features without false positives is possible when the training data size exceeds a threshold.
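To make the sequential selection idea concrete, the following is a minimal sketch of greedy forward feature selection with a neural network predictor. It is an illustration under stated assumptions, not the exact NGP procedure: the network architecture, the validation split, and the mean-squared-error selection loss are choices made here for brevity.

# Illustrative sketch of greedy forward selection with a neural network
# predictor; the actual NGP algorithm may differ in training and loss details.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

def greedy_select(X, y, N, hidden=(32,), seed=0):
    """Select N feature indices by sequential loss minimization."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=seed)
    selected = []                        # chosen feature indices, in order
    remaining = list(range(X.shape[1]))  # candidate feature indices
    for _ in range(N):
        best_j, best_loss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            net = MLPRegressor(hidden_layer_sizes=hidden,
                               max_iter=500, random_state=seed)
            net.fit(X_tr[:, cols], y_tr)
            # validation mean-squared error used as the selection loss here
            loss = np.mean((net.predict(X_va[:, cols]) - y_va) ** 2)
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected  # earlier entries are the more important features

# Toy usage: the target depends only on features 0 and 3 out of 10
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = np.sin(X[:, 0]) + X[:, 3] ** 2
print(greedy_select(X, y, N=2))

Because the selected list is built one feature at a time by loss minimization, its ordering directly yields the descending notion of feature importance mentioned above.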