Two-player (antagonistic) games on (possibly stochastic) graphs are a prevalent model in theoretical computer science, notably as a framework for reactive synthesis. Optimal strategies may require randomisation when dealing with inherently probabilistic goals, when balancing multiple objectives, or in contexts of partial information. There is no unique way to define randomised strategies: for instance, one can use so-called mixed strategies or behavioural ones. In the most general settings, these two classes do not share the same expressiveness. A seminal result in game theory, Kuhn's theorem, asserts their equivalence in games of perfect recall. This result crucially relies on strategies being able to use infinite memory, i.e., unlimited knowledge of the entire past of a play. However, computer systems are finite in practice. Hence it is pertinent to restrict our attention to finite-memory strategies, defined as automata with outputs. Randomisation can be implemented in these in different ways: each of the initialisation, the outputs, and the transitions can be either randomised or deterministic. Depending on which aspects are randomised, the expressiveness of the corresponding class of finite-memory strategies differs. In this work, we study two-player turn-based stochastic games and provide a complete taxonomy of the classes of finite-memory strategies obtained by varying which of the three aforementioned components are randomised. Our taxonomy holds in settings of both perfect and imperfect information.
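To make the three randomisable components concrete, the following is a minimal illustrative sketch, not the formalism of this work. It assumes a strategy is stored as three tables of finite distributions (initialisation, outputs, memory updates) and that the memory update depends only on the current memory and game state. The class `FiniteMemoryStrategy`, the helpers `sample` and `is_deterministic`, and the three-letter `label` (D for deterministic, R for randomised, in the order initialisation, outputs, updates) are all hypothetical names introduced here for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, Hashable, List, Tuple

# A finite distribution, stored as outcome -> probability.
Dist = Dict[Hashable, float]


def sample(dist: Dist, rng: random.Random) -> Hashable:
    """Draw one outcome from a finite distribution."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def is_deterministic(dist: Dist) -> bool:
    """True when exactly one outcome has positive probability."""
    return sum(1 for p in dist.values() if p > 0) == 1


@dataclass
class FiniteMemoryStrategy:
    """Hypothetical automaton with outputs; every component may be randomised."""
    init: Dist                                      # distribution over memory states
    output: Dict[Tuple[Hashable, Hashable], Dist]   # (memory, state) -> actions
    update: Dict[Tuple[Hashable, Hashable], Dist]   # (memory, state) -> next memory

    def label(self) -> str:
        """Return 'D' or 'R' per component: initialisation, outputs, updates."""
        parts = ([self.init], self.output.values(), self.update.values())
        return "".join(
            "D" if all(is_deterministic(d) for d in part) else "R"
            for part in parts
        )

    def play(self, states: List[Hashable], rng: random.Random) -> List[Hashable]:
        """Sample one action per visited game state, updating memory as we go."""
        memory = sample(self.init, rng)
        actions = []
        for s in states:
            actions.append(sample(self.output[(memory, s)], rng))
            memory = sample(self.update[(memory, s)], rng)
        return actions


# Randomised initialisation only: a coin is flipped once, then play is fixed.
mixed_like = FiniteMemoryStrategy(
    init={"m0": 0.5, "m1": 0.5},
    output={("m0", "s"): {"a": 1.0}, ("m1", "s"): {"b": 1.0}},
    update={("m0", "s"): {"m0": 1.0}, ("m1", "s"): {"m1": 1.0}},
)

# Randomised outputs only: a fresh coin is flipped at every step.
behavioural_like = FiniteMemoryStrategy(
    init={"m0": 1.0},
    output={("m0", "s"): {"a": 0.5, "b": 0.5}},
    update={("m0", "s"): {"m0": 1.0}},
)
```

Under these assumptions, `mixed_like.label()` returns `"RDD"` and `behavioural_like.label()` returns `"DRD"`. The first strategy repeats a single action throughout a play, while the second may alternate between actions, which hints at why the choice of randomised components can affect expressiveness.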