SwarmPlay: Interactive Tic-tac-toe Board Game with Swarm of Nano-UAVs driven by Reinforcement Learning

From arXiv. Accepted to the 30th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2021. IEEE copyright. arXiv admin note: substantial text overlap with arXiv:2108.00488

Reinforcement learning (RL) methods have been actively applied in the field of robotics, allowing the system itself to find a solution for a task that would otherwise require a complex decision-making algorithm. In this paper, we present a novel RL-based Tic-tac-toe scenario, SwarmPlay, in which each playing piece is represented by an individual drone that has its own mobility and uses swarm intelligence to win against a human player. The combination of a challenging swarm strategy and human-drone collaboration thus aims to make games with machines tangible and interactive. Although some research on AI for board games such as chess already exists, the SwarmPlay technology has the potential to offer much more engagement and interaction with the user, as it proposes a multi-agent swarm instead of a single interactive robot. We explore users' evaluation of RL-based swarm behavior in comparison with game theory-based behavior. A preliminary user study revealed that participants were highly engaged in the game with drones (70% gave the maximum score on the Likert scale) and found it less artificial than regular computer-based systems (80%). The effect of the game outcome on the user's perception of the game was analyzed and discussed. The user study also indicated that SwarmPlay has the potential to be implemented in a wider range of games, significantly improving human-drone interactivity.
