Generalization Bounds in the Predict-then-Optimize Framework

The predict-then-optimize framework is fundamental in many practical settings: predict the unknown parameters of an optimization problem, and then solve the problem using the predicted values of the parameters. A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters. This loss function was recently introduced in Elmachtoub and Grigas (2017) and referred to as the Smart Predict-then-Optimize (SPO) loss. In this work, we seek to provide bounds on how well the performance of a prediction model fit on training data generalizes out-of-sample, in the context of the SPO loss. Since the SPO loss is non-convex and non-Lipschitz, standard results for deriving generalization bounds do not apply. We first derive bounds based on the Natarajan dimension that, in the case of a polyhedral feasible region, scale at most logarithmically in the number of extreme points, but, in the case of a general convex feasible region, have linear dependence on the decision dimension. By exploiting the structure of the SPO loss function and a key property of the feasible region, which we denote as the strength property, we can dramatically improve the dependence on the decision and feature dimensions. Our approach and analysis rely on placing a margin around problematic predictions that do not yield unique optimal solutions, and then providing generalization bounds in the context of a modified margin SPO loss function that is Lipschitz continuous. Finally, we characterize the strength property and show that the modified SPO loss can be computed efficiently for both strongly convex bodies and polytopes with an explicit extreme point representation.
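To make the loss concrete, here is a minimal sketch of the SPO loss for a polytope given by an explicit list of extreme points. It assumes a minimization problem with a linear objective and uses the worst-case tie-breaking convention from Elmachtoub and Grigas (2017): the loss is the true cost of the worst decision that is optimal for the predicted cost vector, minus the true optimal cost. The function name `spo_loss`, the tolerance `tol`, and the toy feasible region are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def spo_loss(c_hat, c, vertices, tol=1e-9):
    """SPO loss over a polytope described by its extreme points (sketch).

    c_hat    : predicted cost vector, shape (d,)
    c        : true cost vector, shape (d,)
    vertices : extreme points of the feasible region, shape (k, d)
    tol      : hypothetical tolerance used to detect ties among optimal vertices
    """
    vertices = np.asarray(vertices, dtype=float)
    pred_costs = vertices @ np.asarray(c_hat, dtype=float)
    true_costs = vertices @ np.asarray(c, dtype=float)

    # Vertices that are optimal for the *predicted* costs (possibly several).
    optimal_for_pred = pred_costs <= pred_costs.min() + tol

    # Worst-case true cost among those vertices, minus the true optimum.
    return float(true_costs[optimal_for_pred].max() - true_costs.min())


# Toy feasible region: the segment between (1, 0) and (0, 1).
vertices = np.array([[1.0, 0.0], [0.0, 1.0]])
c_true = np.array([1.0, 2.0])  # the first vertex is truly optimal

loss_good = spo_loss(np.array([0.9, 1.0]), c_true, vertices)  # picks the first vertex
loss_bad = spo_loss(np.array([1.1, 1.0]), c_true, vertices)   # picks the second vertex
loss_tie = spo_loss(np.array([1.0, 1.0]), c_true, vertices)   # tie, worst case is used

print(loss_good, loss_bad, loss_tie)
```

The two predictions `[0.9, 1.0]` and `[1.1, 1.0]` are close to each other, yet the loss jumps between them. This is the discontinuity at predictions without a unique optimal solution that the abstract addresses by placing a margin around such predictions.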

