Deep reinforcement learning provides a promising approach to text-based games for studying natural language communication between humans and artificial agents. However, generalization remains a major challenge, as agents depend critically on the complexity and variety of the training tasks. In this paper, we address this problem by introducing a hierarchical framework built upon a knowledge-graph-based RL agent. At the high level, a meta-policy is executed to decompose the whole game into a set of subtasks specified by textual goals, and to select one of them based on the knowledge graph (KG). At the low level, a sub-policy is then executed to conduct goal-conditioned reinforcement learning. We carry out experiments on games of various difficulty levels and show that the proposed method enjoys favorable generalizability.
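To make the two-level control loop concrete, the sketch below shows one plausible way the pieces described above could fit together: a knowledge graph tracks entities from observations, a high-level meta-policy selects a textual goal from KG-derived candidates, and a low-level goal-conditioned sub-policy emits the next text command. This is a minimal illustrative sketch under assumed interfaces; all class names (`KnowledgeGraph`, `MetaPolicy`, `SubPolicy`) and the toy environment are hypothetical stand-ins, not the paper's actual implementation.

```python
import random

# Hypothetical sketch of the hierarchical agent described in the abstract.
# All names and interfaces here are illustrative assumptions.

class KnowledgeGraph:
    """Tracks (subject, relation, object) triples from game observations."""
    def __init__(self):
        self.triples = set()

    def update(self, observation):
        # Placeholder extraction: a real agent would run an information
        # extraction step (e.g., OpenIE-style) over the observation text.
        for word in observation.split():
            self.triples.add(("player", "observed", word))

    def candidate_goals(self):
        # Derive textual subtask goals from the current graph state.
        return [f"examine {obj}" for (_, _, obj) in self.triples]

class MetaPolicy:
    """High level: picks one textual subtask goal based on the KG."""
    def select_goal(self, kg):
        goals = kg.candidate_goals()
        # A learned meta-policy would score goals; we sample uniformly here.
        return random.choice(goals) if goals else "explore"

class SubPolicy:
    """Low level: goal-conditioned action selection."""
    def act(self, observation, goal):
        # A learned sub-policy would condition on (observation, goal);
        # this stub simply issues the goal as the next text command.
        return goal

def play_episode(env_reset, env_step, max_steps=10):
    kg, meta, sub = KnowledgeGraph(), MetaPolicy(), SubPolicy()
    obs = env_reset()
    for _ in range(max_steps):
        kg.update(obs)
        goal = meta.select_goal(kg)    # high-level subtask selection
        action = sub.act(obs, goal)    # low-level goal-conditioned step
        obs, reward, done = env_step(action)
        if done:
            break

if __name__ == "__main__":
    # Toy stand-in environment, for demonstration only.
    state = {"t": 0}
    def env_reset():
        state["t"] = 0
        return "you see a lamp and a key"
    def env_step(action):
        state["t"] += 1
        return f"step {state['t']}: nothing happens", 0.0, state["t"] >= 3
    play_episode(env_reset, env_step)
```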