Learning to reflect: A unifying approach for data-driven stochastic control strategies

Stochastic optimal control problems have a long tradition in applied probability, and the questions they address are highly relevant in a multitude of fields. Even though theoretical solutions are well understood in many scenarios, their practicability suffers from the assumption that the dynamics of the underlying stochastic process are known, which raises the statistical challenge of developing purely data-driven strategies. For the mathematically separate classes of continuous diffusion processes and Lévy processes, we show that developing efficient strategies for related singular stochastic control problems can essentially be reduced to finding rate-optimal estimators, with respect to the sup-norm risk, of those objects associated with the invariant distribution of ergodic processes that determine the theoretical solution of the control problem. From a statistical perspective, we exploit the exponential $\beta$-mixing property as the common factor of both scenarios to drive the convergence analysis, indicating that relying on general stability properties of Markov processes is a sufficiently powerful and flexible approach to treat complex applications requiring statistical methods. Moreover, we show that in the Lévy case, even though jump processes are per se more difficult to handle both in statistics and in control theory, a fully data-driven strategy can be constructed whose regret is of significantly better order than in the diffusion case.
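To make the statistical side of this reduction concrete, the following is a minimal sketch, not the estimator or the control strategy of the paper. It estimates the invariant density of an ergodic diffusion from a single simulated path and measures the error in sup-norm over a grid. Everything specific here is an illustrative assumption that does not come from the abstract: the Ornstein-Uhlenbeck model (chosen because its invariant law is known to be N(0, 1)), the Gaussian kernel, the fixed bandwidth `h`, the evaluation grid, and the helper names `simulate_ou`, `kernel_density`, and `sup_norm_error`.

```python
import numpy as np


def simulate_ou(T=2000.0, dt=0.05, seed=0):
    """Sample dX = -X dt + sqrt(2) dW on a time grid (assumed toy model).

    Uses the exact AR(1) transition of the Ornstein-Uhlenbeck process,
    so the sampled chain has invariant law N(0, 1) for any step size dt.
    """
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    a = np.exp(-dt)                       # one-step autoregression coefficient
    noise = np.sqrt(1.0 - a**2) * rng.standard_normal(n)
    x = np.empty(n)
    x[0] = rng.standard_normal()          # start in the stationary law
    for k in range(1, n):
        x[k] = a * x[k - 1] + noise[k]
    return x


def kernel_density(path, grid, h):
    """Gaussian kernel estimate of the invariant density at each grid point."""
    z = (grid[:, None] - path[None, :]) / h
    weights = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return weights.sum(axis=1) / (len(path) * h)


def sup_norm_error(T=2000.0, dt=0.05, h=0.3, seed=0):
    """Maximum absolute estimation error over a grid on [-3, 3]."""
    path = simulate_ou(T=T, dt=dt, seed=seed)
    grid = np.linspace(-3.0, 3.0, 61)
    estimate = kernel_density(path, grid, h)
    truth = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)  # N(0, 1) density
    return float(np.max(np.abs(estimate - truth)))


if __name__ == "__main__":
    print("sup-norm error on the grid:", sup_norm_error())
```

The bandwidth is fixed by hand in this sketch; a data-driven strategy would have to select it from the observations, and the rate at which the sup-norm error decays in the observation horizon `T` is the quantity that the regret analysis described above depends on.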
