Universal Complexity Bounds Based on Value Iteration and Application to Entropy Games

We develop value iteration-based algorithms to solve in a unified manner different classes of combinatorial zero-sum games with mean-payoff type rewards. These algorithms rely on an oracle, evaluating the dynamic programming operator up to a given precision. We show that the number of calls to the oracle needed to determine exact optimal (positional) strategies is, up to a factor polynomial in the dimension, of order R/sep, where the "separation" sep is defined as the minimal difference between distinct values arising from strategies, and R is a metric estimate, involving the norm of approximate sub- and super-eigenvectors of the dynamic programming operator. We illustrate this method with two applications. The first one is a new proof, leading to improved complexity estimates, of a theorem of Boros, Elbassioni, Gurvich and Makino, showing that turn-based mean-payoff games with a fixed number of random positions can be solved in pseudo-polynomial time. The second one concerns entropy games, a model introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin. The rank of an entropy game is defined as the maximal rank among all the ambiguity matrices determined by strategies of the two players. We show that entropy games with a fixed rank, in their original formulation, can be solved in polynomial time, and that an extension of entropy games incorporating weights can be solved in pseudo-polynomial time under the same fixed rank condition.
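
The idea of recovering an exact value from approximate value iteration can be illustrated on the simplest case, a deterministic turn-based mean-payoff game with integer weights. The sketch below is not the algorithm of the paper: it assumes an exact evaluation of the dynamic programming (Shapley) operator instead of an approximate oracle, and the toy game, the names `shapley_operator`, `approximate_values` and `round_to_game_value`, and the iteration count of 200 are all hypothetical choices made for illustration. It relies on the standard fact that, in such a game with n positions, every value is the mean weight of a simple cycle, hence a rational with denominator at most n, so that a sufficiently accurate estimate can be rounded to the exact value.

```python
from fractions import Fraction


def shapley_operator(x, edges, owner):
    """Apply T once: T(x)_i = max or min over edges (i -> j, weight w) of w + x_j."""
    result = []
    for i, out_edges in enumerate(edges):
        candidates = [w + x[j] for j, w in out_edges]
        # Player Max maximizes at the positions she owns, Min minimizes at his.
        result.append(max(candidates) if owner[i] == "max" else min(candidates))
    return result


def approximate_values(edges, owner, num_iterations):
    """Return T^k(0) / k as exact fractions; this converges to the mean-payoff values."""
    x = [0] * len(edges)
    for _ in range(num_iterations):
        x = shapley_operator(x, edges, owner)
    return [Fraction(v, num_iterations) for v in x]


def round_to_game_value(estimate, num_positions):
    """Round to the closest rational with denominator at most num_positions.

    This is valid only once the estimate is closer to the true value than
    half the gap between distinct candidate values (the separation).
    """
    return estimate.limit_denominator(num_positions)


# Hypothetical toy game with two positions.
# edges[i] lists (successor, weight) pairs; weights are paid by Min to Max.
edges = [
    [(0, 1), (1, 3)],  # position 0: loop with weight 1, or move to 1 with weight 3
    [(0, 0), (1, 2)],  # position 1: move to 0 with weight 0, or loop with weight 2
]
owner = ["max", "min"]

# 200 iterations is an ad hoc choice for this toy game, not a bound from the paper.
estimates = approximate_values(edges, owner, 200)
values = [round_to_game_value(v, len(edges)) for v in estimates]
print(estimates)
print(values)
```

In this toy game Max prefers to move from position 0 to position 1, and Min prefers to move back, so the play follows a cycle of total weight 3 and length 2. The rounding step is where the separation enters: the smaller the gap between distinct candidate values, the more iterations are needed before the estimate can be rounded safely.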
