Monte Carlo Tree Search (MCTS) is a sampling-based best-first method for searching for optimal decisions. The success of MCTS depends heavily on how the tree is built, and the selection process plays a fundamental role in this. One selection mechanism that has proved reliable is based on the Upper Confidence Bounds for Trees (UCT). UCT attempts to balance exploration and exploitation by considering the values stored in the statistical tree of MCTS. However, some tuning of UCT is necessary for this to work well. In this work, we use Evolutionary Algorithms (EAs) to evolve mathematical expressions with the goal of substituting the UCT formula, and we use the evolved expressions in MCTS. More specifically, we evolve expressions by means of our proposed Semantic-inspired Evolutionary Algorithm in MCTS approach (SIEA-MCTS). This approach is inspired by semantics in Genetic Programming (GP), where the use of fitness cases is seen as a requirement. Fitness cases are normally used to determine the fitness of individuals and can also be used to compute the semantic similarity (or dissimilarity) of individuals. However, fitness cases are not available in MCTS. We extend this notion by using multiple reward values from MCTS, which allow us to determine both the fitness of an individual and its semantics. By doing so, we show that SIEA-MCTS is able to evolve mathematical expressions that yield better or competitive results compared to UCT, without the need to tune these evolved expressions. We compare the performance of the proposed SIEA-MCTS against MCTS algorithms, MCTS Rapid Action Value Estimation algorithms, three variants of the *-minimax family of algorithms, a random controller and two other EA approaches. We show that SIEA-MCTS consistently outperforms most of these intelligent controllers in the challenging game of Carcassonne.
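For readers unfamiliar with the selection step, the following is a minimal sketch of one common form of the UCT rule, not the implementation used in this work. The function names `uct_score` and `select_child`, the `(total_reward, visits)` tuple layout, and the default constant `c = sqrt(2)` are assumptions made for illustration only. An evolved expression of the kind described above would take the place of `uct_score`.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """One common form of UCT: mean reward plus an exploration bonus.

    Hypothetical helper; `c` is the exploration constant that usually
    needs tuning per domain.
    """
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploitation = total_reward / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCT score.

    `children` is a list of (total_reward, visits) tuples (assumed layout).
    The parent visit count is approximated as the sum of child visits.
    """
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits, c),
    )


if __name__ == "__main__":
    stats = [(8.0, 10), (3.0, 4)]
    print(select_child(stats, c=0.0))  # pure exploitation
    print(select_child(stats))         # with exploration bonus
```

With `c = 0` the rule picks the child with the highest mean reward. A larger `c` shifts the choice toward less-visited children, which is the balance that a tuned UCT, or an evolved replacement, has to get right.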