Infinite use of finite means: Zero-Shot Generalization using Compositional Emergent Protocols

Human language has been described as a system that makes \textit{use of finite means to express an unlimited array of thoughts}. Of particular interest is compositionality, whereby the meaning of a compound expression can be deduced from the meanings of its constituent parts. If artificial agents can develop compositional communication protocols akin to human language, they can generalize seamlessly to unseen combinations. The real question, however, is how to induce compositionality in emergent communication. Studies have recognized the role of curiosity in enabling linguistic development in children. It is this same intrinsic urge that drives us to master complex tasks with decreasing amounts of explicit reward. In this paper, we use this intrinsic feedback to induce a systematic and unambiguous protolanguage in artificial agents. We show how these intrinsic rewards can be leveraged to train agents to induce compositionality in the absence of any external feedback. Additionally, we introduce gComm, an environment for investigating grounded language acquisition in 2D-grid environments. Using gComm, we demonstrate how compositionality enables agents not only to interact with unseen objects but also to transfer skills from one task to another in a zero-shot setting: \textit{Can an agent, trained to `pull' and `push twice', `pull twice'?}
