Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates

In 2021, the Johns Hopkins University Applied Physics Laboratory held an internal challenge to develop artificially intelligent (AI) agents that could excel at the collaborative card game Hanabi. Agents were evaluated on their ability to play with human players whom they had never previously encountered. This study details the development of the agent that won the challenge by achieving a human-play average score of 16.5, outperforming the current state of the art for human-bot Hanabi scores. The winning agent's development consisted of observing and accurately modeling the author's decision-making in Hanabi, then training with a behavioral clone of the author. Notably, the agent discovered a human-complementary play style by first mimicking human decision-making, then exploring variations on the human-like strategy that led to higher simulated human-bot scores. This work examines in detail the design and implementation of this human-compatible Hanabi teammate, as well as the existence and implications of human-complementary strategies and how they may be explored for more successful applications of AI in human-machine teams.
