Identifying the configuration of chess pieces from an image of a chessboard is a problem in computer vision that has not yet been solved accurately. However, it is important for helping amateur chess players improve their games by facilitating automatic computer analysis without the overhead of manually entering the pieces. Current approaches are limited by the lack of large datasets and are not designed to adapt to unseen chess sets. This paper puts forth a new dataset synthesised from a 3D model that is an order of magnitude larger than existing ones. Trained on this dataset, a novel end-to-end chess recognition system is presented that combines traditional computer vision techniques with deep learning. It localises the chessboard using a RANSAC-based algorithm that computes a projective transformation of the board onto a regular grid. Using two convolutional neural networks, it then predicts an occupancy mask for the squares in the warped image and finally classifies the pieces. The described system achieves an error rate of 0.23% per square on the test set, 28 times better than the current state of the art. Further, a few-shot transfer learning approach is developed that is able to adapt the inference system to a previously unseen chess set using just two photos of the starting position, obtaining a per-square accuracy of 99.83% on images of that new chess set. The code, dataset, and trained models are made available online.
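As a rough illustration of the board-warping step described above, the following minimal sketch computes a projective transformation (homography) from four board corners and uses it to find which grid square an image point falls in. It is not the RANSAC-based localisation from the paper: it assumes the four corners are already known, and the function names (`homography_from_corners`, `apply_homography`, `square_index`) and the corner coordinates are hypothetical, chosen only for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_corners(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, dividing by the projective coordinate."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


def square_index(h, point):
    """Return (column, row) in 0..7 of the unit grid square containing the point, or None."""
    u, v = apply_homography(h, point)
    col, row = int(u // 1), int(v // 1)
    if 0 <= col < 8 and 0 <= row < 8:
        return (col, row)
    return None


# Hypothetical pixel coordinates of the four board corners in a photo.
image_corners = [(100, 50), (500, 80), (560, 400), (40, 380)]
# Target: an 8x8 grid of unit squares.
grid_corners = [(0, 0), (8, 0), (8, 8), (0, 8)]
H = homography_from_corners(image_corners, grid_corners)
```

With `H` in hand, each of the 64 squares can be located in the photo (or the photo resampled onto the grid) before any occupancy or piece classifier is applied. In practice the corner estimates are noisy, which is why a robust estimator such as RANSAC is used for localisation rather than an exact four-point solve.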