We address the problem of prescribing an optimal decision in a framework where the cost function depends on uncertain problem parameters that need to be learned from data. Earlier work proposed prescriptive formulations based on supervised machine learning methods. These prescriptive methods can factor in contextual information on a potentially large number of covariates to take context-specific actions that are superior to any static decision. When working with noisy or corrupt data, however, such nominal prescriptive methods can be prone to adverse overfitting phenomena and fail to generalize to out-of-sample data. In this paper we combine ideas from robust optimization and the statistical bootstrap to propose novel prescriptive methods that safeguard against overfitting. Indeed, we show that a particular entropic robust counterpart to such nominal formulations guarantees good performance on synthetic bootstrap data. As bootstrap data is often a sensible proxy for actual out-of-sample data, our robust counterpart can be interpreted as directly encouraging good out-of-sample performance. The associated robust prescriptive methods furthermore reduce to convenient, tractable convex optimization problems in the context of local learning methods such as nearest neighbors and Nadaraya-Watson learning. We illustrate our data-driven decision-making framework and our novel robustness notion on a small newsvendor problem.
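For concreteness, the following is a minimal sketch of the kind of nominal prescriptive formulation and entropic robust counterpart alluded to above; the notation ($z$ for the decision, $y$ for the uncertain problem parameter, $x$ for the observed covariates, $c$ for the cost, $w_i$ for local-learning weights, $r$ for a robustness radius) is illustrative shorthand rather than the paper's exact formulation. With Nadaraya-Watson weights built from a kernel $K$ and bandwidth $h$, a nominal prescriptive method prescribes
\[
  \hat z(x) \in \arg\min_{z \in \mathcal Z} \; \sum_{i=1}^{n} w_i(x)\, c(z, y_i),
  \qquad
  w_i(x) = \frac{K\!\left((x - x_i)/h\right)}{\sum_{j=1}^{n} K\!\left((x - x_j)/h\right)},
\]
while one standard way to instantiate an entropic robust counterpart is to minimize the worst-case weighted cost over all reweightings $q$ within a relative-entropy ball of radius $r$ around $w(x)$,
\[
  \hat z_{r}(x) \in \arg\min_{z \in \mathcal Z} \; \max_{q \in \Delta_n,\; \mathrm{KL}\left(q \,\|\, w(x)\right) \le r} \; \sum_{i=1}^{n} q_i\, c(z, y_i),
\]
which remains convex in $z$ whenever each $c(\cdot, y_i)$ is convex, since it is a pointwise maximum of convex functions. In the newsvendor illustration, for instance, the cost of ordering $z$ against demand $y$ could take the familiar convex form $c(z, y) = h\,(z - y)^+ + b\,(y - z)^+$ with holding and backorder costs $h$ and $b$.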