Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for searching for optimal decisions. The popularity of MCTS stems from its extraordinary results in the challenging two-player game of Go, a game considered much harder than Chess and, until very recently, regarded as infeasible for Artificial Intelligence methods. The success of MCTS depends heavily on how the tree is built, and the selection process plays a fundamental role in this. One particular selection mechanism that has proved reliable is based on the Upper Confidence Bounds for Trees, commonly referred to as UCT. UCT attempts to balance exploration and exploitation by considering the values stored in the statistical tree of the MCTS. However, some tuning of the MCTS UCT is necessary for it to work well. In this work, we use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal of substituting the UCT mathematical expression. We compare our proposed approach, called Evolution Strategy in MCTS (ES-MCTS), against five variants of the MCTS UCT, three variants of the star-minimax family of algorithms, and a random controller in the Game of Carcassonne. We also use a variant of our proposed EA-based controller, dubbed ES partially integrated in MCTS. We show how the ES-MCTS controller is able to outperform all of these 10 intelligent controllers, including robust MCTS UCT controllers.
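For context, the selection step that the evolved expressions would replace is conventionally the UCT score computed from the statistics stored in the tree. The sketch below shows a minimal UCT-based child selection, assuming the standard Kocsis-Szepesvári formulation; the exploration constant `c`, the `Node` fields, and the function names are illustrative assumptions and do not necessarily match the controllers evaluated in the paper.

```python
import math
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    visits: int = 0          # number of times this node was visited
    total_reward: float = 0.0  # accumulated simulation reward
    children: List["Node"] = field(default_factory=list)

def uct_value(parent: Node, child: Node, c: float = 1.41) -> float:
    """Standard UCT score: mean reward plus an exploration bonus.
    An evolved expression (as proposed in ES-MCTS) would substitute this formula."""
    if child.visits == 0:
        return float("inf")  # prefer unvisited children during selection
    exploitation = child.total_reward / child.visits
    exploration = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploitation + exploration

def select_child(parent: Node) -> Node:
    """Selection phase: pick the child maximising the UCT score."""
    return max(parent.children, key=lambda ch: uct_value(parent, ch))
```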