Given a Markov decision process (MDP) and a linear-time ($\omega$-regular or LTL) specification, the controller synthesis problem aims to compute an optimal policy that satisfies the specification. More recently, problems that reason about the asymptotic behavior of systems have been proposed through the lens of steady-state planning. This entails finding a control policy for an MDP such that the Markov chain induced by the solution policy satisfies a given set of constraints on its steady-state distribution. This paper studies a generalization of the controller synthesis problem for a linear-time specification under steady-state constraints on the asymptotic behavior. We present an algorithm to find a deterministic policy satisfying $\omega$-regular and steady-state constraints by characterizing the solutions as an integer linear program, and we experimentally evaluate our approach.
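To make the steady-state constraints concrete, here is a minimal sketch in standard notation (the symbols $x$, $P_\pi$, $S$, $\ell_s$, $u_s$ are illustrative and not necessarily the paper's): a policy $\pi$ induces a Markov chain with transition matrix $P_\pi$ over the state space $S$, and a steady-state constraint bounds the stationary distribution $x$ of that chain,
$$
x^\top P_\pi = x^\top, \qquad \sum_{s \in S} x(s) = 1, \qquad x(s) \ge 0, \qquad \ell_s \le x(s) \le u_s \ \text{ for designated states } s.
$$
Once the policy (and hence $P_\pi$) is fixed, these conditions are linear in $x$; the integer linear program described in the abstract additionally encodes the choice of a deterministic policy and the $\omega$-regular objective.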