We consider data-driven approaches that integrate a machine learning prediction model within distributionally robust optimization (DRO), given limited joint observations of uncertain parameters and covariates. Our framework is flexible in that it accommodates a variety of regression setups and DRO ambiguity sets. We investigate asymptotic and finite-sample properties of solutions obtained using Wasserstein, sample robust optimization, and phi-divergence-based ambiguity sets within our DRO formulations, and explore cross-validation approaches for sizing these ambiguity sets. Through numerical experiments, we validate our theoretical results, study the effectiveness of our approaches for sizing ambiguity sets, and illustrate the benefits of our DRO formulations in the limited-data regime, even when the prediction model is misspecified.