Taming Nonconvexity in Kernel Feature Selection---Favorable Properties of the Laplace Kernel

Kernel-based feature selection is an important tool in nonparametric statistics. Despite its many practical applications, there is little statistical theory available to support the method. A core challenge is that the objective functions of the optimization problems used to define kernel-based feature selection are nonconvex. The literature has only studied the statistical properties of the \emph{global optima}, which is a mismatch, given that the gradient-based algorithms available for nonconvex optimization are only able to guarantee convergence to local minima. Studying the full landscape associated with kernel-based methods, we show that feature selection objectives using the Laplace kernel (and other $\ell_1$ kernels) come with statistical guarantees that other kernels, including the ubiquitous Gaussian kernel (and other $\ell_2$ kernels), do not possess. Based on a sharp characterization of the gradient of the objective function, we show that $\ell_1$ kernels eliminate unfavorable stationary points that appear when using an $\ell_2$ kernel. Armed with this insight, we establish statistical guarantees for $\ell_1$-kernel-based feature selection that do not require reaching the global minimum. In particular, we establish model-selection consistency of $\ell_1$-kernel-based feature selection in recovering main effects and hierarchical interactions in the nonparametric setting with $n \sim \log p$ samples.
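To make the contrast between $\ell_1$ and $\ell_2$ kernels concrete, here is a minimal NumPy sketch. It assumes a per-feature weighted form of each kernel, $k_\beta(x, y) = \exp(-\sum_j \beta_j |x_j - y_j|)$ for the Laplace case and $k_\beta(x, y) = \exp(-\sum_j \beta_j (x_j - y_j)^2)$ for the Gaussian case, with nonnegative weights $\beta$. This parameterization and the function names are illustrative assumptions; the abstract itself does not specify the objective or the kernel parameterization.

```python
import numpy as np


def weighted_laplace_kernel(X, Y, beta):
    """Assumed l1-type kernel: exp(-sum_j beta_j * |x_j - y_j|)."""
    # Pairwise absolute differences, shape (n_X, n_Y, n_features).
    diff = np.abs(X[:, None, :] - Y[None, :, :])
    return np.exp(-(diff * beta).sum(axis=-1))


def weighted_gaussian_kernel(X, Y, beta):
    """Assumed l2-type kernel: exp(-sum_j beta_j * (x_j - y_j)^2)."""
    # Pairwise squared differences, shape (n_X, n_Y, n_features).
    diff2 = (X[:, None, :] - Y[None, :, :]) ** 2
    return np.exp(-(diff2 * beta).sum(axis=-1))


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Hypothetical weights: feature 1 has weight zero, i.e. it is "deselected".
beta = np.array([1.0, 0.0, 0.5])

K_lap = weighted_laplace_kernel(X, X, beta)
K_gau = weighted_gaussian_kernel(X, X, beta)

# Perturbing a zero-weight feature leaves the kernel matrix unchanged.
X_perturbed = X.copy()
X_perturbed[:, 1] += rng.normal(size=5)
K_lap_perturbed = weighted_laplace_kernel(X_perturbed, X_perturbed, beta)
```

In this sketch, feature selection corresponds to driving some entries of `beta` to zero: a feature with zero weight no longer influences the kernel matrix, which is why `K_lap` and `K_lap_perturbed` coincide.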


