Multiclass classification by sparse multinomial logistic regression

In this paper we consider high-dimensional multiclass classification by sparse multinomial logistic regression. We first propose a feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size, and derive nonasymptotic bounds for the misclassification excess risk of the resulting classifier. We also establish the tightness of these bounds by deriving the corresponding minimax lower bounds. In particular, we show that there exist two regimes, corresponding to a small and a large number of classes. The bounds can be reduced under an additional low-noise condition. Finding a penalized maximum likelihood solution with a complexity penalty requires, however, a combinatorial search over all possible models. To design a feature selection procedure that is computationally feasible for high-dimensional data, we propose multinomial logistic group Lasso and Slope classifiers and show that they also achieve the minimax order.
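To make the group Lasso idea concrete, here is a minimal sketch of a multinomial logistic group Lasso fit by proximal gradient descent. It is an illustration under stated assumptions, not the paper's procedure: it assumes the penalty is the sum of Euclidean norms of the rows of the coefficient matrix (so a feature is kept or dropped for all classes at once), and the function names, the tuning parameter `lam`, the step-size rule, and the iteration count are all hypothetical choices.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def group_soft_threshold(B, t):
    """Proximal operator of t * (sum of row norms): shrink each row's norm by t."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale

def fit_group_lasso_multinomial(X, y, n_classes, lam, n_iter=500):
    """Minimize average multinomial log-loss + lam * sum_j ||B[j, :]||_2.

    X: (n, d) feature matrix (assumed not identically zero),
    y: integer labels in {0, ..., n_classes - 1}.
    Returns a (d, n_classes) coefficient matrix with some rows exactly zero.
    """
    n, d = X.shape
    Y = np.eye(n_classes)[y]                 # one-hot labels, shape (n, n_classes)
    B = np.zeros((d, n_classes))
    # Conservative fixed step size (assumption): n / ||X||_2^2.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        P = softmax(X @ B)                   # predicted class probabilities
        grad = X.T @ (P - Y) / n             # gradient of the average log-loss
        B = group_soft_threshold(B - step * grad, step * lam)
    return B
```

The selected features are the rows with nonzero norm, for example `np.flatnonzero(np.linalg.norm(B, axis=1) > 0)`. Larger values of `lam` zero out more rows; choosing `lam` in practice (for instance by cross-validation) is outside the scope of this sketch.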


Multinomial logistic regression

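The theoretical assumptions of the multivariate logistic regression model are far less restrictive than those of discriminant analysis, and the model imposes no strict requirements on distribution type, covariance matrices, and the like. However, studies that make heavy use of multivariate logistic regression often overlook another important issue: possible multicollinearity among the model's independent variables. Like other multivariate regression methods, the logistic regression model is sensitive to multicollinearity. As the correlation among variables increases, the standard errors of the coefficient estimates grow sharply. At the same time, the coefficients become highly sensitive to both the sample and the model specification: small changes in the specification, or adding or removing cases from the sample, can produce large changes in the coefficient estimates.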
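One common way to check for multicollinearity before fitting is to compute variance inflation factors (VIFs). The sketch below is an illustration, not something prescribed by the text above: it uses the standard identity that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix, and the function name is hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (shape (n, d)).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on the other columns; this equals the j-th diagonal entry of the
    inverse correlation matrix. Exactly collinear columns make the
    correlation matrix singular, in which case the inversion fails.
    """
    R = np.corrcoef(X, rowvar=False)   # (d, d) correlation matrix of the columns
    return np.diag(np.linalg.inv(R))
```

A VIF near 1 means a predictor is almost uncorrelated with the others, while large values signal the inflated standard errors described above. Thresholds such as 5 or 10 are common rules of thumb rather than strict cutoffs.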