Stochastic two-player games model systems with both adversarial and stochastic environments. The adversarial environment is modeled by a player (Player 2) who tries to prevent the system (Player 1) from achieving its objective. We consider finitary versions of the traditional mean-payoff objective, replacing the long-run average of the payoffs with the average payoff computed over a finite sliding window. Two variants have been considered: in one, the maximum window length is fixed and given, while in the other, it is not fixed but is required to be bounded. For both variants, we present complexity bounds and algorithmic solutions for computing strategies for Player 1 to ensure that the objective is satisfied with positive probability, with probability 1, or with probability at least $p$. The solution crucially relies on a reduction to the special case of nonstochastic two-player games. We give a general characterization of prefix-independent objectives for which this reduction holds. The positive and almost-sure decision problems are in ${\sf PTIME}$ for the fixed variant and in ${\sf NP \cap coNP}$ for the bounded variant. For arbitrary $p$, the decision problem is in ${\sf NP \cap coNP}$ for both variants, thus matching the bounds for simple stochastic games. By our reduction, the memory requirements for both players in stochastic games are the same as for nonstochastic games. Further, for nonstochastic games, we improve upon the upper bound on the memory requirement of Player 1 and the lower bound on the memory requirement of Player 2. To the best of our knowledge, this is the first work to consider stochastic games with finitary quantitative objectives.
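To make the sliding-window idea concrete, here is a minimal Python sketch of the fixed variant on a finite sequence of payoffs. It assumes the commonly used formulation, which the abstract does not spell out: a position is "good" if some window of length at most `max_len` starting there has average payoff at least `threshold`. The function names, the default threshold of 0, and the choice to treat windows that run past the end of the sequence as not closed are all assumptions made for illustration, not the paper's algorithm.

```python
from typing import List


def window_closes(payoffs: List[int], start: int, max_len: int, threshold: int = 0) -> bool:
    """Return True if some window of length 1..max_len starting at `start`
    has average payoff >= threshold (assumed definition, for illustration)."""
    total = 0
    for k in range(1, max_len + 1):
        if start + k > len(payoffs):
            # The window would run past the observed prefix: treat as not closed.
            return False
        total += payoffs[start + k - 1]
        # average >= threshold  <=>  total >= threshold * k  (k > 0)
        if total >= threshold * k:
            return True
    return False


def good_positions(payoffs: List[int], max_len: int, threshold: int = 0) -> List[bool]:
    """For each position of the finite sequence, report whether a window of
    length at most max_len starting there closes."""
    return [window_closes(payoffs, i, max_len, threshold) for i in range(len(payoffs))]


if __name__ == "__main__":
    sample = [-1, 2, -3, 1, 1, 1]
    print(good_positions(sample, max_len=2))
    print(good_positions(sample, max_len=4))
```

In this sketch, increasing `max_len` can only turn more positions good, which mirrors the difference between the fixed variant (one given bound) and the bounded variant (some bound must exist).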