Reinforcement Learning (RL) has recently emerged as a powerful tool for solving complex problems in the domain of board games, where an agent must learn complex strategies and moves from its own experience and the rewards it receives. While RL has outperformed existing state-of-the-art methods at playing simple video games and popular board games, it has yet to demonstrate its capability on ancient games. Here, we address one such problem: we train agents using three methods, namely Monte Carlo, Q-learning, and Expected Sarsa, to learn an optimal policy for playing the strategic Royal Game of Ur. The state space of our game is large and complex, yet our agents show promising results at playing the game and learning important strategic moves. Although it is hard to conclude which algorithm performs better overall when trained with limited resources, Expected Sarsa shows the fastest learning.
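To make the distinction between the trained methods concrete, the following is a minimal sketch of the Expected Sarsa update for a tabular Q-table, assuming an ε-greedy behavior policy; the function name, hyperparameter values, and tabular representation are illustrative assumptions, not the paper's actual implementation. Where Q-learning bootstraps from the maximum next-state action value, Expected Sarsa uses the expectation of Q(s', a') under the current policy, which reduces update variance.

```python
import numpy as np

def expected_sarsa_update(Q, state, action, reward, next_state,
                          alpha=0.1, gamma=0.9, epsilon=0.1):
    """One Expected Sarsa update on a tabular Q (2-D array: states x actions).

    Hypothetical sketch: instead of Q-learning's max over next actions,
    the TD target uses the expected Q-value of the next state under the
    current epsilon-greedy policy.
    """
    n_actions = Q.shape[1]
    # Probability of each next action under epsilon-greedy:
    # epsilon/n for every action, plus (1 - epsilon) on the greedy one.
    probs = np.full(n_actions, epsilon / n_actions)
    probs[np.argmax(Q[next_state])] += 1.0 - epsilon
    expected_q = np.dot(probs, Q[next_state])
    # Standard TD update toward the expected target.
    td_target = reward + gamma * expected_q
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q
```

With α = 0.1 and an all-zero table, a single reward of 1 moves the visited entry to 0.1, since the expected next-state value is still zero.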