In this manuscript, we consider the problem of kernel classification under the Gaussian data design, and under source and capacity assumptions on the dataset. While the decay rates of the prediction error have been extensively studied under much more generic assumptions for kernel ridge regression, deriving decay rates for the classification problem has hitherto been considered a much more challenging task. In this work we leverage recent analytical results for the learning curves of linear classification with a generic loss function to derive the rates of decay of the misclassification (prediction) error as a function of the sample complexity, for two standard classification settings: margin-maximizing Support Vector Machines (SVMs) and ridge classification. Using numerical and analytical arguments, we derive the error rates as a function of the source and capacity coefficients, and contrast the two methods.
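As an illustration of the setting (a minimal sketch, not the paper's code), the source and capacity conditions can be emulated with a synthetic Gaussian design whose covariance spectrum decays as a power law (capacity) and whose teacher coefficients also decay as a power law (source-like). The dimension cutoff, the exponents alpha and beta, the teacher construction, and the use of scikit-learn's LinearSVC (with large C as a proxy for the margin-maximizing SVM) and RidgeClassifier (with a tiny penalty for weakly regularized ridge classification) are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's code): a synthetic
# "source/capacity" classification setup under a Gaussian data design.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)
d = 500                                 # truncation of the feature expansion (assumed)
alpha, beta = 1.5, 1.0                  # capacity / source-like exponents (assumed values)
lam = np.arange(1, d + 1) ** (-alpha)   # power-law covariance spectrum (capacity condition)
theta = np.arange(1, d + 1) ** (-beta)  # power-law teacher coefficients (source-like decay)

def sample(n):
    """Gaussian features with spectrum lam; noiseless teacher labels."""
    x = rng.standard_normal((n, d)) * np.sqrt(lam)
    return x, np.sign(x @ theta)

x_test, y_test = sample(20000)
for n in (100, 400, 1600, 6400):
    x, y = sample(n)
    # Large C ~ margin-maximizing SVM; tiny penalty ~ weakly regularized ridge classifier.
    err_svm = np.mean(LinearSVC(C=1e6, max_iter=100000).fit(x, y).predict(x_test) != y_test)
    err_rdg = np.mean(RidgeClassifier(alpha=1e-6).fit(x, y).predict(x_test) != y_test)
    print(f"n={n:5d}  SVM error={err_svm:.4f}  ridge error={err_rdg:.4f}")
```

On a log-log plot of misclassification error against n, the slopes of the two resulting curves give the empirical decay rates whose dependence on the source and capacity coefficients the manuscript characterizes analytically.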