The adoption of pre-trained language models to generate action plans for embodied agents is a promising research strategy. However, executing instructions in real or simulated environments requires verifying that actions are both feasible and relevant to completing the goal. We propose a new method that combines a language model and reinforcement learning for the task of building objects in a Minecraft-like environment according to natural language instructions. Our method first generates a set of consistently achievable sub-goals from the instructions and then completes the associated sub-tasks with a pre-trained RL policy. The proposed method served as the RL baseline at the IGLU 2022 competition.
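The two-stage pipeline described above can be summarized in a minimal sketch. All names here (`SubgoalGenerator`, `PretrainedPolicy`, `build_from_instruction`) are illustrative assumptions, not the paper's actual interfaces: a language-model stage turns an instruction into an ordered list of block-placement sub-goals, and an RL stage completes each resulting sub-task.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subgoal:
    """A single block placement: grid coordinates and a block colour id."""
    position: Tuple[int, int, int]
    block_id: int


class SubgoalGenerator:
    """Stand-in for the language-model stage: instruction -> ordered sub-goals."""

    def generate(self, instruction: str) -> List[Subgoal]:
        # In the real system a pre-trained LM parses the instruction;
        # here we return a fixed toy plan so the sketch runs end to end.
        return [Subgoal(position=(0, 0, i), block_id=1) for i in range(3)]


class PretrainedPolicy:
    """Stand-in for the RL stage: executes one sub-goal in the environment."""

    def reach(self, subgoal: Subgoal) -> bool:
        # A real policy would roll out low-level actions until the block is
        # placed (or a step limit is hit); here we simply report success.
        print(f"placing block {subgoal.block_id} at {subgoal.position}")
        return True


def build_from_instruction(instruction: str) -> bool:
    """Generate achievable sub-goals from an instruction, then complete
    each associated sub-task with the pre-trained policy."""
    planner, policy = SubgoalGenerator(), PretrainedPolicy()
    return all(policy.reach(goal) for goal in planner.generate(instruction))


if __name__ == "__main__":
    build_from_instruction("build a blue column three blocks tall")
```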