Text-based games (TBGs) have become a popular proving ground for learning-based agents that make decisions in quasi-real-world settings. The crux of the problem for a reinforcement learning agent in a TBG is identifying the objects in the world and how those objects relate to it. While recent work using text-based resources to increase an agent's knowledge and improve its generalization has shown promise, we posit in this paper that there is much yet to be learned from visual representations of these same worlds. Specifically, we propose to retrieve images that represent specific instances of text observations from the world and to train our agents on such images. This improves the agent's overall understanding of the game 'scene' and of objects' relationships to the world around them, and the variety of visual representations on offer allows the agent to generalize better about those relationships. We show that incorporating such images improves the performance of agents in various TBG settings.