Solving jigsaw puzzles requires grasping the visual features of a sequence of patches and efficiently exploring a solution space that grows exponentially with the sequence length. Therefore, visual deep reinforcement learning (DRL) should address this problem more efficiently than optimization solvers coupled with neural networks. Based on this assumption, we introduce Alphazzle, a reassembly algorithm based on single-player Monte Carlo Tree Search (MCTS). A major difference with DRL algorithms lies in the unavailability of the game reward for MCTS, and we show how to estimate it from the visual input with neural networks. This constraint is induced by the puzzle-solving task and dramatically adds to the task complexity (and interest!). We perform an in-depth ablation study that shows the importance of MCTS and the neural networks working together. We achieve excellent results and gain exciting insights into the combination of DRL and visual feature learning.
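The key idea above (single-player MCTS where the unavailable game reward is replaced by a learned estimate of a partial assembly's quality) can be sketched as follows. This is a minimal illustration, not Alphazzle's implementation: the `value_network` stub is a hypothetical stand-in for the paper's visual reward estimator, and pieces are abstract ids rather than image patches.

```python
import math
import random

def value_network(placed):
    # Hypothetical stand-in for a learned reward estimator: it scores a
    # partial reassembly by the fraction of pieces at their correct slot.
    # (Here "correct" is faked as piece id == position; a real network
    # would score the visual coherence of the assembled patches.)
    return sum(1 for i, p in enumerate(placed) if p == i) / max(len(placed), 1)

class Node:
    def __init__(self, placed, remaining):
        self.placed = placed        # tuple of piece ids placed so far
        self.remaining = remaining  # frozenset of pieces left to place
        self.children = {}          # action (piece id) -> child Node
        self.visits = 0
        self.value_sum = 0.0

    def ucb_child(self, c=1.4):
        # Standard UCT selection over the (fully expanded) children.
        def score(item):
            _, child = item
            exploit = child.value_sum / child.visits
            explore = c * math.sqrt(math.log(self.visits) / child.visits)
            return exploit + explore
        return max(self.children.items(), key=score)

def mcts(root, n_sim=200):
    for _ in range(n_sim):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded.
        while node.remaining and len(node.children) == len(node.remaining):
            _, node = node.ucb_child()
            path.append(node)
        # Expansion: place one untried piece at the next position.
        if node.remaining:
            untried = [p for p in node.remaining if p not in node.children]
            piece = random.choice(untried)
            child = Node(node.placed + (piece,), node.remaining - {piece})
            node.children[piece] = child
            node = child
            path.append(node)
        # Evaluation: no game reward is available, so the value network
        # scores the partial assembly instead of a random rollout.
        v = value_network(node.placed)
        # Backpropagation.
        for n in path:
            n.visits += 1
            n.value_sum += v
    # Return the most-visited first action.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

pieces = frozenset(range(4))
root = Node((), pieces)
first_piece = mcts(root)
```

Replacing the rollout reward with a network prediction is what makes the search tractable here: a random rollout of patch placements carries no signal, whereas the visual estimator can judge a partial assembly directly.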