Lewis signaling games are a class of simple communication games used to simulate the emergence of language. In these games, two agents must agree on a communication protocol in order to solve a cooperative task. Previous work has shown that agents trained to play this game with reinforcement learning tend to develop languages that display undesirable properties from a linguistic point of view (lack of generalization, lack of compositionality, etc.). In this paper, we aim to provide a better understanding of this phenomenon by analytically studying the learning problem in Lewis games. As a core contribution, we demonstrate that the standard objective in Lewis games can be decomposed into two components: a co-adaptation loss and an information loss. This decomposition surfaces two potential sources of overfitting, which we show may undermine the emergence of a structured communication protocol. In particular, when we control for overfitting on the co-adaptation loss, we recover desired properties in the emergent languages: they are more compositional and generalize better.
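To make the setup concrete, here is a minimal sketch of a Lewis signaling game trained with REINFORCE: a speaker observes an object and emits a message, a listener sees only the message and guesses the object, and both are rewarded when the guess is correct. The tabular softmax policies, game sizes, learning rate, and step count below are illustrative assumptions, not the paper's actual experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

N_OBJECTS, N_MESSAGES = 5, 5   # hypothetical tiny game
LR, STEPS = 0.5, 3000          # illustrative hyperparameters

# Tabular policy logits: speaker maps object -> message,
# listener maps message -> guessed object.
speaker = np.zeros((N_OBJECTS, N_MESSAGES))
listener = np.zeros((N_MESSAGES, N_OBJECTS))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(STEPS):
    obj = rng.integers(N_OBJECTS)
    p_msg = softmax(speaker[obj])
    msg = rng.choice(N_MESSAGES, p=p_msg)
    p_guess = softmax(listener[msg])
    guess = rng.choice(N_OBJECTS, p=p_guess)
    reward = 1.0 if guess == obj else 0.0

    # REINFORCE update: reward * grad log pi, for each agent.
    g_s = -p_msg
    g_s[msg] += 1.0
    speaker[obj] += LR * reward * g_s

    g_l = -p_guess
    g_l[guess] += 1.0
    listener[msg] += LR * reward * g_l

# Greedy communication accuracy of the emerged protocol.
acc = np.mean([listener[speaker[o].argmax()].argmax() == o
               for o in range(N_OBJECTS)])
print(f"accuracy: {acc:.2f}")
```

Because both agents are rewarded only on joint success, they must break symmetry and converge on a shared object-message mapping; runs can also settle into partially pooling protocols, which is one face of the degenerate languages the paper analyzes.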