We consider high-dimensional multiclass classification by sparse multinomial logistic regression. Unlike binary classification, the multiclass setup admits an entire spectrum of possible notions of sparsity, each associated with different structural assumptions on the regression coefficient matrix. We propose a computationally feasible feature selection procedure based on penalized maximum likelihood with convex penalties that capture the specific type of sparsity at hand. In particular, we consider global sparsity, double row-wise sparsity, and low-rank sparsity, and show that with properly chosen tuning parameters the resulting plug-in classifiers attain the minimax generalization error bounds (in terms of misclassification excess risk) within the corresponding classes of multiclass sparse linear classifiers. The approach is general and can be adapted to other types of sparsity as well.
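To make the idea of penalized maximum likelihood with a convex, structure-inducing penalty concrete, the following minimal sketch fits a multinomial logistic model whose coefficient matrix is encouraged to be row-wise sparse, so that a feature is dropped when its entire row is zero. The specific choices here are assumptions made for illustration and are not claimed to be the procedure analyzed in the paper: the penalty is a plain group lasso on the rows of the coefficient matrix, the solver is proximal gradient descent with a fixed conservative step size, and the names `fit_row_sparse_multinomial`, `row_soft_threshold`, `predict`, and the tuning parameter `lam` are hypothetical.

```python
import numpy as np


def softmax(Z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def row_soft_threshold(B, t):
    """Proximal operator of t * sum_j ||B[j, :]||_2 (group lasso on rows).

    Each row is shrunk toward zero; rows with norm <= t become exactly zero.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * B


def fit_row_sparse_multinomial(X, y, n_classes, lam, n_iter=500):
    """Proximal gradient for the average multinomial negative log-likelihood
    plus lam * sum_j ||B[j, :]||_2 (an assumed, illustrative penalty).

    X: (n, d) design matrix, y: (n,) integer labels in {0, ..., n_classes - 1}.
    Returns the (d, n_classes) coefficient matrix.
    """
    n, d = X.shape
    Y = np.eye(n_classes)[y]                 # one-hot encoding of the labels
    B = np.zeros((d, n_classes))
    # Conservative step: ||X||_2^2 / n upper-bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        P = softmax(X @ B)                   # fitted class probabilities
        grad = X.T @ (P - Y) / n             # gradient of the smooth part
        B = row_soft_threshold(B - step * grad, step * lam)
    return B


def predict(X, B):
    """Plug-in classifier: assign the class with the largest linear score."""
    return np.argmax(X @ B, axis=1)
```

The features retained by such a fit are the indices of the nonzero rows, for example `np.flatnonzero(np.linalg.norm(B, axis=1) > 0)`. Other sparsity structures would replace `row_soft_threshold` with the proximal operator of the corresponding convex penalty, such as entrywise soft-thresholding for global sparsity or singular-value thresholding for a low-rank structure.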