World models improve a learning agent's ability to operate efficiently in interactive and situated environments. This work focuses on the task of building world models of text-based game environments. Text-based games, or interactive narratives, are reinforcement learning environments in which agents perceive and interact with the world using textual natural language. These environments contain long, multi-step puzzles or quests woven through a world that is filled with hundreds of characters, locations, and objects. Our world model learns to simultaneously: (1) predict changes in the world caused by an agent's actions when representing the world as a knowledge graph; and (2) generate the set of contextually relevant natural language actions required to operate in the world. We frame this task as a Set of Sequences generation problem by exploiting the inherent structure of knowledge graphs and actions, and we introduce both a transformer-based multi-task architecture and a loss function to train it. A zero-shot ablation study on never-before-seen textual worlds shows that our methodology significantly outperforms existing textual world modeling techniques and demonstrates the importance of each of our contributions.
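To make the Set of Sequences framing concrete, the following is a minimal sketch of how a change between two knowledge graph states might be written as an unordered set of short token sequences. The triple encoding, the `add`/`del` markers, the function name `graph_diff`, and the toy world are all illustrative assumptions, not the representation used in this work.

```python
from typing import Set, Tuple

# Assumed encoding: one knowledge graph edge is a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


def graph_diff(before: Set[Triple], after: Set[Triple]) -> Set[Tuple[str, ...]]:
    """Express the change between two graph states as a set of token sequences.

    Each sequence is a hypothetical marker ("add" or "del") followed by the
    three tokens of the affected triple. The result is a set, so the order of
    the sequences carries no meaning.
    """
    added = {("add",) + t for t in after - before}
    removed = {("del",) + t for t in before - after}
    return added | removed


# Toy world state before and after the action "take lamp" (hypothetical example).
before = {("player", "in", "kitchen"), ("lamp", "in", "kitchen")}
after = {("player", "in", "kitchen"), ("lamp", "in", "inventory")}

diff = graph_diff(before, after)
for seq in sorted(diff):
    print(" ".join(seq))
```

Under this sketch, a model that predicts the effect of an action would target `diff` rather than the full graph, and the set of valid actions could be encoded the same way, as a set of token sequences such as `("take", "lamp")`.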