Markov Decision Processes under Model Uncertainty

We introduce a general framework for Markov decision problems under model uncertainty in a discrete-time, infinite-horizon setting. By establishing a dynamic programming principle, we obtain a local-to-global paradigm: solving a local (one-time-step) robust optimization problem yields an optimizer of the global (infinite-time-step) robust stochastic optimal control problem, as well as a corresponding worst-case measure. Moreover, we apply this framework to portfolio optimization using S&P 500 data. We present two types of ambiguity sets: the first is fully data-driven, given by a Wasserstein ball around the empirical measure; the second is a parametric set of multivariate normal distributions, where the uncertainty sets for the parameters are estimated from the data. It turns out that in scenarios where the market is volatile or bearish, the optimal portfolio strategies from the robust optimization problem outperform those obtained without model uncertainty, showcasing the importance of taking model uncertainty into account.
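The local-to-global idea can be illustrated with a minimal sketch of robust value iteration on a toy problem. This is not the framework of the paper: it assumes finite state and action spaces, a discount factor `beta`, and a finite ambiguity set of candidate transition kernels in which the worst case is chosen separately for each state-action pair. The function name `robust_value_iteration` and all numbers below are hypothetical and serve only as illustration.

```python
import numpy as np


def robust_value_iteration(rewards, kernels, beta=0.9, tol=1e-10, max_iter=10_000):
    """Toy robust value iteration (illustrative assumption, not the paper's method).

    rewards: array of shape (S, A), one-step reward for each state-action pair.
    kernels: array of shape (K, S, A, S), a finite ambiguity set of K candidate
             transition kernels; kernels[k, s, a] is a distribution over next states.
    """
    v = np.zeros(rewards.shape[0])
    for _ in range(max_iter):
        # Expected continuation value under every candidate kernel: shape (K, S, A).
        cont = kernels @ v
        # Local robust problem: worst case over kernels, then best action.
        v_new = (rewards + beta * cont.min(axis=0)).max(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    cont = kernels @ v
    q = rewards + beta * cont.min(axis=0)
    policy = q.argmax(axis=1)    # maximizing action in each state
    worst = cont.argmin(axis=0)  # index of the worst-case kernel per (state, action)
    return v, policy, worst


# Hypothetical two-state, two-action example with two candidate kernels.
rewards = np.array([[1.0, 0.0],
                    [0.0, 2.0]])
k0 = np.array([[[0.9, 0.1], [0.5, 0.5]],
               [[0.5, 0.5], [0.1, 0.9]]])
k1 = np.array([[[0.6, 0.4], [0.5, 0.5]],
               [[0.5, 0.5], [0.4, 0.6]]])
kernels = np.stack([k0, k1])

v_robust, policy, worst = robust_value_iteration(rewards, kernels)
print("robust value:", v_robust)
print("policy:", policy)
print("worst-case kernel index per (state, action):", worst)
```

Each iteration solves only the one-step max-min problem, yet the fixed point gives a stationary policy together with a worst-case kernel selection for the discounted infinite-horizon problem in this toy setting. A Wasserstein ball or a parametric family of normal distributions would replace the finite set `kernels` and turn the inner minimum into a genuine optimization problem.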
