Text-based adventure games provide a platform on which to explore reinforcement learning in the context of a combinatorial action space, such as natural language. We present a deep reinforcement learning architecture that represents the game state as a knowledge graph which is learned during exploration. This graph is used to prune the action space, enabling more efficient exploration. The question of which action to take can be reduced to a question-answering task, a form of transfer learning that pre-trains certain parts of our architecture. In experiments using the TextWorld framework, we show that our proposed technique can learn a control policy faster than baseline alternatives. We have also open-sourced our code at https://github.com/rajammanabrolu/KG-DQN.
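To make the pruning idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the knowledge graph is stored as a set of (subject, relation, object) triples and that a candidate action is scored by how many of its tokens name an entity already in the graph. The names `graph_entities`, `prune_actions`, and `top_k` are hypothetical, as are the example triples and actions; the architecture described above may score and prune actions differently.

```python
from typing import List, Set, Tuple

# Assumed representation: one (subject, relation, object) triple per fact.
Triple = Tuple[str, str, str]


def graph_entities(graph: Set[Triple]) -> Set[str]:
    """Collect every subject and object mentioned in the graph."""
    entities: Set[str] = set()
    for subj, _rel, obj in graph:
        entities.add(subj)
        entities.add(obj)
    return entities


def prune_actions(actions: List[str], graph: Set[Triple], top_k: int) -> List[str]:
    """Keep the top_k actions that mention the most known entities."""
    entities = graph_entities(graph)

    def score(action: str) -> int:
        # Count whitespace-separated tokens that match a graph entity.
        return sum(1 for token in action.split() if token in entities)

    # sorted() is stable, so tied actions keep their original order.
    ranked = sorted(actions, key=score, reverse=True)
    return ranked[:top_k]


# Hypothetical game state and candidate actions, for illustration only.
graph = {
    ("you", "in", "kitchen"),
    ("fridge", "in", "kitchen"),
    ("apple", "in", "fridge"),
}
actions = ["open fridge", "take apple from fridge", "go north", "eat sword"]
pruned = prune_actions(actions, graph, top_k=2)
print(pruned)
```

Under these assumptions, an agent would evaluate only the actions in `pruned` rather than the full candidate list, which is the sense in which a graph of observed facts can shrink the space to explore.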