Strategy iteration is a technique frequently used for two-player games in order to determine the winner or compute payoffs, but to the best of our knowledge no general framework for strategy iteration has been considered. Inspired by previous work on simple stochastic games, we propose a general formalisation of strategy iteration for solving least fixpoint equations over a suitable class of complete lattices, based on MV-chains. We devise algorithms that can be used for non-expansive fixpoint functions represented as so-called min-, respectively, max-decompositions. Correspondingly, we develop two different techniques: strategy iteration from above, which has to solve the problem that iteration might reach a fixpoint that is not the least, and from below, which is algorithmically simpler, but requires a more involved correctness argument. We apply our method to solve energy games and compute behavioural metrics for probabilistic automata.
翻译:战略迭代是一种常用于双玩游戏的技术,目的是确定赢家或计算报酬,但据我们所知,没有考虑过任何战略迭代的总体框架。在以往简单随机游戏工作启发下,我们提议对战略迭代进行总体正规化,以便根据机动车链,解决适合的完整拉特方程的最小固定点方程,并基于机动车链。我们设计了可用于代表所谓的微调(即最大分解)的非扩展固定点功能的算法。我们相应地开发了两种不同的技术:从上到下,战略迭代必须解决迭代可能达到非最小固定点的问题,从下到下,这在算学上比较简单,但需要更相关的正确性论证。我们运用了我们的方法来解决能源游戏,并计算概率性自动数据的行为指标。