Hex is a complex game with a high branching factor. We attempt, for the first time, to solve Hex without game-tree structures and their associated pruning methods. We also abstain from heuristic information about Virtual Connections and Semi-Virtual Connections, which all previously known computer Hex players have used. In particular, we forgo the H-search algorithm, which is the basis for finding such connections and has been used with success in previous Hex-playing agents. Instead, we use reinforcement learning through self-play, with neural networks as function approximators, to bypass the problems of the high branching factor and of maintaining large tables for state-action evaluations. Our code is based primarily on NeuroHex. The inspiration is drawn from the recent success of AlphaGo Zero.