Weighted timed games are two-player zero-sum games played in a timed automaton equipped with integer weights. We consider optimal reachability objectives, in which one of the players, that we call Min, wants to reach a target location while minimising the cumulated weight. While knowing if Min has a strategy to guarantee a value lower than a given threshold is known to be undecidable (with two or more clocks), several conditions, one of them being the divergence, have been given to recover decidability. In such weighted timed games (like in untimed weighted games in the presence of negative weights), Min may need finite memory to play (close to) optimally. This is thus tempting to try to emulate this finite memory with other strategic capabilities. In this work, we allow the players to use stochastic decisions, both in the choice of transitions and of timing delays. We give for the first time a definition of the expected value in weighted timed games, overcoming several theoretical challenges. We then show that, in divergent weighted timed games, the stochastic value is indeed equal to the classical (deterministic) value, thus proving that Min can guarantee the same value while only using stochastic choices, and no memory.
翻译:权重时限博弈是在定时自动机上玩的两个零和游戏,自动机带有整数权重。我们考虑最优可达性目标,在此目标中,我们称其中之一的玩家为Min,他想要到达目标位置,同时使累积的权重最小化。虽然已知Min是否有策略以保证小于给定阈值的价值是不可判定的(存在两个或更多时钟),但是已经给出了一些条件,其中之一是发散,以恢复可判定性。在这样的权重时限博弈(与存在负权重的无定时权重博弈一样),Min可能需要有限的记忆来实现(接近)最优。因此,尝试用其他战略能力模拟有限记忆是很有诱惑力的。在这项工作中,我们允许玩家在选择跃迁和定时延迟时使用随机决策。我们首次给出了权重时限博弈中预期值的定义,克服了几个理论挑战。然后,我们证明在发散的权重时限博弈中,随机值确实等于经典(确定性)值,从而证明Min可以只使用随机选择而无需记忆来保证相同的价值。