Support vector machine (SVM) is a powerful classification method that has achieved great success in many fields. Since its performance can be seriously impaired by redundant covariates, model selection techniques are widely used for SVM with high-dimensional covariates. As an alternative to model selection, significant progress has been made on model averaging over the past decades. Yet no frequentist model averaging method has been developed for SVM. This work aims to fill that gap by proposing a frequentist model averaging procedure for SVM that selects the optimal weights by cross-validation. Even when the number of covariates diverges exponentially with the sample size, we show the asymptotic optimality of the proposed method in the sense that the ratio of its hinge loss to the lowest possible loss converges to one. We also derive the convergence rate, which provides further insight into model averaging. Compared with model selection methods for SVM, which require the tedious but critical task of tuning-parameter selection, the model averaging method avoids this task and shows promising performance in empirical studies.
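The procedure described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it is a hypothetical toy version under simplifying assumptions: two candidate linear SVMs (one using only the informative covariates, one using all covariates) fitted by plain subgradient descent, with the averaging weight chosen by minimizing the 5-fold cross-validated hinge loss over a grid on the simplex. All data, subset choices, and hyperparameters below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def hinge_loss(y, score):
    """Average hinge loss for labels y in {-1, +1}."""
    return float(np.mean(np.maximum(0.0, 1.0 - y * score)))

def fit_linear_svm(X, y, lam=0.1, lr=0.1, epochs=300):
    """L2-regularized linear SVM via full-batch subgradient descent (toy solver)."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        margin = y * (X @ w)
        viol = margin < 1.0                      # points violating the margin
        grad = lam * w - (X[viol].T @ y[viol]) / n
        w -= lr * grad
    return w

# Toy data: 2 informative covariates plus 3 redundant (noise) covariates
n = 400
X = rng.normal(size=(n, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n))

# Two candidate models defined by covariate subsets (hypothetical choices)
subsets = [np.array([0, 1]), np.arange(5)]

# 5-fold CV: collect out-of-fold decision scores for each candidate model
folds = np.array_split(rng.permutation(n), 5)
scores = np.zeros((n, len(subsets)))
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    for k, S in enumerate(subsets):
        w = fit_linear_svm(X[np.ix_(train_idx, S)], y[train_idx])
        scores[test_idx, k] = X[np.ix_(test_idx, S)] @ w

# With K = 2 candidates the simplex is one-dimensional: grid-search the weight
grid = np.linspace(0.0, 1.0, 21)
cv_losses = [hinge_loss(y, a * scores[:, 0] + (1 - a) * scores[:, 1]) for a in grid]
a_star = grid[int(np.argmin(cv_losses))]
print("weight on candidate model 1:", a_star)
print("CV hinge loss of the average:", min(cv_losses))
```

For more than two candidate models, the grid search would be replaced by a constrained optimization over the weight simplex; the sketch only conveys the weight-by-cross-validation idea, not the paper's actual algorithm or asymptotic analysis.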