SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI. Within the Deep Reinforcement Learning (DRL) field, this objective motivated multiple works on embodied language use. However, current approaches focus on language as a communication tool in very simplified and non-diverse social situations: the "naturalness" of language is reduced to the concept of high vocabulary size and variability. In this paper, we argue that aiming towards human-level AI requires a broader set of key social skills: 1) language use in complex and variable social contexts; 2) beyond language, complex embodied communication in multimodal settings within constantly evolving social worlds. We explain how concepts from cognitive sciences could help AI to draw a roadmap towards human-like intelligence, with a focus on its social dimensions. As a first step, we propose to expand current research to a broader set of core social skills. To do this, we present SocialAI, a benchmark to assess the acquisition of social skills of DRL agents using multiple grid-world environments featuring other (scripted) social agents. We then study the limits of a recent SOTA DRL approach when tested on SocialAI and discuss important next steps towards proficient social agents. Videos and code are available at https://sites.google.com/view/socialai.

Related content

Deep Reinforcement Learning

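Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning with deep learning techniques. The main task in traditional reinforcement learning is for an agent to learn, from the rewards it receives from the environment, the behavior that maximizes reward. Traditional model-free reinforcement learning methods, however, rely on function approximation so that the agent can learn a value function or a policy. In that setting, the strong function-approximation capacity of deep learning is a natural replacement for hand-specified features, and it makes better-performing end-to-end learning possible.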
