Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Go's high branching factor makes traditional search techniques ineffective, even on leading-edge hardware, and its evaluation function can change drastically with the placement of a single stone. Recent work [Maddison et al. (2015); Clark & Storkey (2015)] shows that search is not strictly necessary for machine Go players. A pure pattern-matching approach, based on a Deep Convolutional Neural Network (DCNN) that predicts the next move, can perform as well as Monte Carlo Tree Search (MCTS)-based open source Go engines such as Pachi [Baudis & Gailly (2012)] when the engine's search budget is limited. We extend this idea in our bot named darkforest, which relies on a DCNN designed for long-term predictions. Darkforest substantially improves the win rate of pattern-matching approaches against MCTS-based approaches, even with looser search budgets. Against human players, the newest version, darkfores2, achieves a stable 3d level on the KGS Go Server as a ranked bot, a substantial improvement upon the estimated 4k-5k ranks for DCNN reported in Clark & Storkey (2015) based on games against other machine players. Adding MCTS to darkfores2 creates a much stronger player named darkfmcts3: with 5000 rollouts, it beats Pachi with 10k rollouts in all 250 games; with 75k rollouts, it achieves a stable 5d level on the KGS server, on par with state-of-the-art Go AIs (e.g., Zen, DolBaram, CrazyStone) except for AlphaGo [Silver et al. (2016)]; with 110k rollouts, it won 3rd place in the January KGS Go Tournament.