Deception is prevalent in human social settings. However, studies of the effect of deception on reinforcement learning algorithms have been limited to simplistic settings, restricting their applicability to complex real-world problems. This paper addresses this gap by introducing a new mixed competitive-cooperative multi-agent reinforcement learning (MARL) environment inspired by popular role-based deception games such as Werewolf, Avalon, and Among Us. The environment's unique challenge lies in the need to cooperate with other agents without knowing whether they are friend or foe. Furthermore, we introduce a model of deception, which we call Bayesian belief manipulation (BBM), and demonstrate its effectiveness at deceiving other agents in this environment while also increasing the deceiving agent's performance.