Markov决策过程的概率-硬性结果</s> (Positivity-hardness results on Markov decision processes)

This paper investigates a series of optimization problems for one-counter Markov decision processes (MDPs) and integer-weighted MDPs with finite state space. Specifically, it considers problems addressing termination probabilities and expected termination times for one-counter MDPs, as well as satisfaction probabilities of energy objectives, conditional and partial expectations, satisfaction probabilities of constraints on the total accumulated weight, the computation of quantiles for the accumulated weight, and the conditional value-at-risk for accumulated weights for integer-weighted MDPs. Although algorithmic results are available for some special instances, the decidability status of the decision versions of these problems is unknown in general. The paper demonstrates that these optimization problems are inherently mathematically difficult by providing polynomial-time reductions from the Positivity problem for linear recurrence sequences. This problem is a well-known number-theoretic problem whose decidability status has been open for decades and it is known that decidability of the Positivity problem would have far-reaching consequences in analytic number theory. So, the reductions presented in the paper show that an algorithmic solution to any of the investigated problems is not possible without a major breakthrough in analytic number theory. The reductions rely on the construction of MDP-gadgets that encode the initial values and linear recurrence relations of linear recurrence sequences. Interestingly, the reductions can also be extended to demonstrate the Positivity-hardness of two problems that address the long-run behavior of a system, namely the model-checking problem of frequency-LTL and the optimization of the long-run average probability to satisfy a path property. Both of these problems have been studied before on special instances, but are open in general.

翻译：本文调查了单人Markov 决策流程( MDPs) 和总重量加权 MDP 中一系列优化问题。具体地说, 本文探讨了关于单人 MDPs 的终止概率和预期终止时间的问题, 以及能源目标的满意度概率、有条件和部分期望、累积总重量限制的满意度概率、累积重量的量化值的计算, 以及整人加权的 MDPs 中累积重量的附加值。虽然有些特殊情况有算法结果, 这些问题的决定版本的衰变状态一般并不为人所知。该文件表明,这些优化问题在数学上本身就很困难, 提供了线性重现序列的多位时间减少问题。这个问题是一个众所周知的数字理论问题, 其衰减状况已经持续了几十年, 众所周知, Posititial 问题的衰变现性将产生深远影响, 解析性模型数的递减性结果是深远的。因此, 精度周期性周期的衰减状态在初步理论中表明, 方向上显示, 方向的递减是一个重大的突破性变变变的路径, 。</s>