We study a finite-horizon two-person zero-sum risk-sensitive stochastic game for continuous-time Markov chains with Borel state and action spaces, in which the payoff rates, transition rates and terminal reward functions are allowed to be unbounded from below and from above, and the policies can be history-dependent. Under suitable conditions, we establish the existence of a solution to the corresponding Shapley equation (SE) by an approximation technique. Then, using the SE and an extension of Dynkin's formula, we prove the existence of a Nash equilibrium and verify that the value of the stochastic game is the unique solution to the SE. Moreover, we develop a value iteration-type algorithm for approximating the value of the stochastic game. The convergence of the algorithm is proved by means of a special contraction operator arising in our risk-sensitive stochastic game. Finally, we illustrate our main results with two examples.