Feature selection identifies subsets of informative features and reduces the dimensionality of the original feature space, helping to provide insight into data generation and a variety of domain problems. Existing methods mainly rely on feature scoring functions or sparse regularizations; however, they have a limited ability to reconcile the representativeness and inter-correlations of features. In this paper, we introduce a novel, simple yet effective regularization approach, named top-$k$ regularization, for supervised feature selection in regression and classification tasks. Structurally, the top-$k$ regularization induces a sub-architecture within the architecture of a learning model, boosting its ability to simultaneously select the most informative features and model complex nonlinear relationships. Theoretically, we derive and prove a uniform approximation error bound for approximating high-dimensional sparse functions with this approach. Extensive experiments on a wide variety of benchmark datasets show that top-$k$ regularization is effective and stable for supervised feature selection.
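The abstract does not spell out the mechanism, but one common way to realize a hard top-$k$ selection inside a learning model is a gating layer that keeps only the $k$ largest-magnitude learned feature weights and zeroes out the rest. The following is a minimal PyTorch sketch of that general idea; the `TopKGate` class and all names in it are illustrative assumptions of ours, not the paper's actual construction.

```python
import torch
import torch.nn as nn

class TopKGate(nn.Module):
    """Hypothetical top-k feature gate (illustrative, not the paper's method):
    keeps the k largest-magnitude learned feature weights, zeroes the rest,
    so only the selected features flow into the downstream model."""

    def __init__(self, num_features: int, k: int):
        super().__init__()
        self.k = k
        # One learnable weight per input feature.
        self.w = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Indices of the k weights with the largest magnitude.
        topk = torch.topk(self.w.abs(), self.k).indices
        mask = torch.zeros_like(self.w)
        mask[topk] = 1.0
        # Hard top-k mask; gradients still reach the surviving weights.
        return x * (self.w * mask)

# Usage: gate 100-dimensional inputs down to 10 selected features,
# then fit any nonlinear regressor/classifier on the gated output.
gate = TopKGate(num_features=100, k=10)
model = nn.Sequential(gate, nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 1))
y_hat = model(torch.randn(32, 100))
```

Note that the hard mask is piecewise constant in the weights, so gradient-based training in this sketch relies on gradients flowing through the surviving entries; smoother relaxations (e.g., straight-through estimators) are also common, and the abstract does not state which variant the paper adopts.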