Surrogate risk minimization is a ubiquitous paradigm in supervised machine learning, wherein a target problem is solved by minimizing a surrogate loss on a dataset. Surrogate regret bounds, also called excess risk bounds, are a common tool to prove generalization rates for surrogate risk minimization. While surrogate regret bounds have been developed for certain classes of loss functions, such as proper losses, general results are relatively sparse. We provide two general results. The first gives a linear surrogate regret bound for any polyhedral (piecewise-linear and convex) surrogate, meaning that surrogate generalization rates translate directly into target rates. The second shows that for sufficiently non-polyhedral surrogates, the regret bound is a square root, meaning that fast surrogate generalization rates translate into slow rates for the target. Together, these results suggest that polyhedral surrogates are optimal in many cases.
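As a sketch of the two bound shapes described above (the notation here is illustrative, not the paper's own): writing the target regret of a hypothesis $f$ as $R_{\mathrm{tgt}}(f) - R_{\mathrm{tgt}}^*$ and its surrogate regret as $R_{\mathrm{sur}}(f) - R_{\mathrm{sur}}^*$, the two regimes take the form

```latex
% Linear regret transfer (polyhedral surrogates), for some constant c > 0:
R_{\mathrm{tgt}}(f) - R_{\mathrm{tgt}}^*
  \;\le\; c \,\bigl( R_{\mathrm{sur}}(f) - R_{\mathrm{sur}}^* \bigr),
% so a surrogate rate of O(1/n) yields a target rate of O(1/n).

% Square-root regret transfer (sufficiently non-polyhedral surrogates):
R_{\mathrm{tgt}}(f) - R_{\mathrm{tgt}}^*
  \;\le\; c \,\sqrt{ R_{\mathrm{sur}}(f) - R_{\mathrm{sur}}^* },
% so a surrogate rate of O(1/n) yields only a target rate of O(1/\sqrt{n}).
```

In the first case, any generalization rate achieved on the surrogate carries over to the target problem at the same order; in the second, the rate is halved in the exponent, which is why fast surrogate rates translate to slow target rates.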