The window mechanism was introduced by Chatterjee et al. to strengthen classical game objectives with time bounds. It permits to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite execution, in contrast to the traditional objectives that only require correctness of behaviors in the limit. The window concept has proved its interest in a variety of two-player zero-sum games because it enables reasoning about such time bounds in system specifications, but also thanks to the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering Markov decision processes. A fundamental problem in this context is the threshold probability problem: given an objective it aims to synthesize strategies that guarantee satisfying runs with a given probability. We solve it for the usual variants of window objectives, where either the time frame is set as a parameter, or we ask if such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models.
翻译:由Chatterjee et al. 引入了窗口机制, 目的是用时间限制来加强经典游戏目标。 它允许将显示在可配置时间框架内可接受行为的系统控制器合成为可接受行为, 并随其无限执行, 与只要求限制中行为正确性的传统目标形成对比。 窗口概念证明了它在各种双玩零和游戏中的利益, 因为它有助于在系统规格中推理这种时间限制, 但也由于它通常产生的可移动性增加。 在这项工作中, 我们通过考虑Markov 决策程序, 将窗口框架扩展至随机环境。 这方面的一个基本问题是临界概率问题: 因为它的目标是综合战略, 保证以给定的概率运行。 我们解决了通常的窗口目标变式, 即时间框架被设定为参数, 或者我们问是否存在这样的时间框架。 我们为基于窗口的目标开发了一种通用的方法, 并在游戏中已经考虑过的经典平均报酬和对等目标进行即刻。 我们的工作为广泛使用窗口机制铺平了道路。