In this study we analyze linear mixed-integer programming problems in which the distribution of the cost vector is observable only through a finite training data set. In contrast to related studies, we assume that the number of random observations may vary across the components of the cost vector. The goal is then to find a prediction rule that converts the data set into an estimate of the expected value of the objective function, and a prescription rule that provides an associated estimate of the optimal decision. We aim at finding the least conservative prediction and prescription rules that satisfy specified asymptotic guarantees as the sample size tends to infinity. We demonstrate that under some mild assumptions the resulting vector optimization problems admit a Pareto optimal solution with attractive theoretical properties. In particular, this solution can be obtained by solving a distributionally robust optimization (DRO) problem over all probability distributions with given component-wise relative entropy distances from the empirical marginal distributions. It turns out that the outlined DRO problem can be solved rather efficiently whenever there exists an efficient algorithm for the respective deterministic problem. In addition, we perform numerical experiments in which the out-of-sample performance of the proposed approach is analyzed.
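As a rough illustration of the kind of formulation described above (the notation below is ours, introduced for exposition, and is not taken from the paper), component-wise relative-entropy ambiguity sets around the empirical marginals lead to a min-max problem of the form
\[
\min_{x \in X} \; \max_{\substack{Q_1,\dots,Q_n \\ D_{\mathrm{KL}}(Q_i \,\|\, \widehat{Q}_i) \le \varepsilon_i,\; i=1,\dots,n}} \; \sum_{i=1}^{n} \mathbb{E}_{Q_i}[c_i] \, x_i ,
\]
where $X$ denotes the mixed-integer linear feasible set, $\widehat{Q}_i$ is the empirical marginal distribution of the $i$-th cost component built from its own (possibly differently sized) sample, and $\varepsilon_i$ is a component-wise relative-entropy radius. Because the objective is linear in $x$ and the ambiguity set factors across components, the inner maximization decomposes component-wise, each term reducing to a worst-case mean of $c_i$ that depends only on the sign of $x_i$; the outer problem therefore retains the structure of the deterministic mixed-integer program, which is consistent with the stated tractability of the DRO problem whenever the deterministic problem can be solved efficiently.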