We study building a multi-task agent in Minecraft. Without human demonstrations, solving long-horizon tasks in this open-ended environment with reinforcement learning (RL) is extremely sample-inefficient. To tackle this challenge, we decompose solving Minecraft tasks into learning basic skills and planning over the skills. We propose three types of fine-grained basic skills in Minecraft, and use RL with intrinsic rewards to accomplish basic skills with high success rates. For skill planning, we use Large Language Models to find the relationships between skills and build a skill graph in advance. When the agent is solving a task, our skill search algorithm walks on the skill graph and generates the proper skill plans for the agent. In experiments, our method accomplishes 24 diverse Minecraft tasks, many of which require sequentially executing more than 10 skills. Our method outperforms baselines on most tasks by a large margin. The project's website and code can be found at https://sites.google.com/view/plan4mc.
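The skill planning described above can be illustrated with a minimal sketch: represent the skill graph as a mapping from each skill to its prerequisite skills, then depth-first search from the target skill to produce an executable plan. The skill names and graph structure below are illustrative placeholders, not the paper's actual LLM-constructed graph or search algorithm.

```python
# Illustrative skill graph: skill -> list of prerequisite skills.
# These entries are hypothetical examples, not the paper's real graph.
SKILL_GRAPH = {
    "harvest_log": [],
    "craft_planks": ["harvest_log"],
    "craft_stick": ["craft_planks"],
    "craft_crafting_table": ["craft_planks"],
    "craft_wooden_pickaxe": ["craft_stick", "craft_crafting_table"],
}

def plan(target, graph, done=None, order=None):
    """Depth-first search on the skill graph: recursively satisfy all
    prerequisites of `target`, then append `target` itself, yielding
    an ordered skill plan the agent can execute sequentially."""
    if done is None:
        done, order = set(), []
    if target in done:          # skill already scheduled
        return order
    for prereq in graph[target]:
        plan(prereq, graph, done, order)
    done.add(target)
    order.append(target)
    return order

print(plan("craft_wooden_pickaxe", SKILL_GRAPH))
# -> ['harvest_log', 'craft_planks', 'craft_stick',
#     'craft_crafting_table', 'craft_wooden_pickaxe']
```

In this sketch, shared prerequisites (here, `craft_planks`) are scheduled only once; a long-horizon task such as crafting a pickaxe unrolls into a sequence of basic skills in dependency order.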