We study the problem of designing optimal learning and decision-making formulations when only historical data is available. Prior work typically commits to a particular class of data-driven formulations and subsequently tries to establish out-of-sample performance guarantees. Here we take the opposite approach. We first define a sensible yardstick with which to measure the quality of any data-driven formulation and subsequently seek an optimal such formulation. Informally, any data-driven formulation balances the proximity of its estimated cost to the actual cost against a guaranteed level of out-of-sample performance. Given an acceptable level of out-of-sample performance, we explicitly construct a data-driven formulation that is uniformly closer to the true cost than any other formulation enjoying the same out-of-sample performance. We show the existence of three distinct out-of-sample performance regimes (a superexponential regime, an exponential regime and a subexponential regime) between which the nature of the optimal data-driven formulation undergoes a phase transition. The optimal data-driven formulation can be interpreted as a classically robust formulation in the superexponential regime, an entropic distributionally robust formulation in the exponential regime and a variance-penalized formulation in the subexponential regime. This final observation unveils a surprising connection between three data-driven formulations that at first glance seem unrelated, a connection which until now remained hidden.
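To make the three families named above concrete, the following is a minimal sketch of what each type of cost estimate could look like on a finite sample of observed costs. It is illustrative only and is not the construction derived in the paper. The function names, the `radius` and `weight` parameters, the direction of the Kullback-Leibler ball (here KL(Q || empirical) <= radius, evaluated through its standard dual) and the grid search over the dual variable are all assumptions made for this example.

```python
import numpy as np


def robust_cost(costs):
    """Classically robust estimate: the worst observed cost (hypothetical helper)."""
    return float(np.max(np.asarray(costs, dtype=float)))


def entropic_dro_cost(costs, radius, lambdas=None):
    """Entropic (KL-ball) worst-case expected cost, via the standard dual

        inf_{lam > 0}  lam * radius + lam * log( mean( exp(costs / lam) ) ).

    Assumption: the ambiguity set is {Q : KL(Q || empirical) <= radius}.
    The infimum is approximated by a grid search over lam.
    """
    costs = np.asarray(costs, dtype=float)
    if lambdas is None:
        lambdas = np.logspace(-3, 3, 2000)  # hypothetical search grid
    m = costs.max()
    # Subtract the maximum before exponentiating for numerical stability:
    # lam * log(mean(exp(c / lam))) = m + lam * log(mean(exp((c - m) / lam)))
    shifted = (costs[None, :] - m) / lambdas[:, None]
    log_mgf = np.log(np.exp(shifted).mean(axis=1))
    values = lambdas * radius + m + lambdas * log_mgf
    return float(values.min())


def variance_penalized_cost(costs, weight):
    """Empirical mean plus a multiple of the empirical standard deviation."""
    costs = np.asarray(costs, dtype=float)
    return float(costs.mean() + weight * costs.std())


if __name__ == "__main__":
    sample = np.array([1.0, 2.0, 3.0, 4.0, 10.0])  # made-up cost observations
    print("robust:           ", robust_cost(sample))
    print("entropic DRO:     ", entropic_dro_cost(sample, radius=0.1))
    print("variance penalty: ", variance_penalized_cost(sample, weight=0.5))
```

In this sketch the entropic estimate lies between the empirical mean and the worst observed cost and grows with `radius`, which gives some intuition for how the three formulations can be viewed as increasingly conservative corrections of the empirical average.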