Value iteration (VI) is a standard approach to solving simple stochastic games (SSGs), but it has long lacked a stopping criterion. Several remedies have appeared recently, among them "optimistic" VI (OVI). However, OVI is applicable only to one-player SSGs with no end components. We lift both assumptions, making OVI available for general SSGs. Further, we apply the idea in the context of topological VI, where we provide an efficient precise solution. To compare the new algorithms with the state of the art, we use not only the standard benchmarks but also a random generator of SSGs that we designed, which can be biased towards various types of models and thus helps in understanding the advantages of different algorithms on SSGs.
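For orientation, the following is a minimal sketch of plain VI from below on a tiny reachability SSG. It is not the OVI or topological algorithm of this work. The game encoding, the state names, and the function `value_iteration` are all hypothetical and chosen only for illustration. The loop stops when successive iterates differ by less than `eps`, which is exactly the kind of naive test that gives no guarantee on the distance to the true value.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption of this sketch):
# state -> (owner, actions), where owner is "max" or "min" and each
# action is a list of (probability, successor) pairs.
Action = List[Tuple[float, str]]
Game = Dict[str, Tuple[str, List[Action]]]


def value_iteration(game: Game, target: str, eps: float = 1e-6,
                    max_iters: int = 1_000_000) -> Dict[str, float]:
    """Approximate reachability values from below with Bellman updates."""
    # Start from the lower bound: 1 in the target, 0 everywhere else.
    v = {s: (1.0 if s == target else 0.0) for s in game}
    for _ in range(max_iters):
        delta = 0.0
        for s, (owner, actions) in game.items():
            if s == target:
                continue
            # Expected value of each available action under the current estimate.
            vals = [sum(p * v[t] for p, t in a) for a in actions]
            new = max(vals) if owner == "max" else min(vals)
            delta = max(delta, abs(new - v[s]))
            v[s] = new  # in-place (Gauss-Seidel style) update
        # Naive stopping test: a small change does NOT bound the true error.
        if delta < eps:
            break
    return v


# Tiny illustrative game: "goal" is the target, "sink" is a losing self-loop.
game: Game = {
    "s": ("max", [[(0.5, "goal"), (0.5, "sink")]]),
    "m": ("min", [[(1.0, "s")], [(0.9, "goal"), (0.1, "sink")]]),
    "goal": ("max", [[(1.0, "goal")]]),
    "sink": ("max", [[(1.0, "sink")]]),
}
values = value_iteration(game, "goal")
```

On this small acyclic example the iterates reach the exact values after one sweep. In general the lower bounds may converge slowly, so a small difference between iterates can coincide with a large error, which is the gap that a proper stopping criterion has to close.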