Learning policies that effectively utilize language instructions in complex, multi-task environments is an important problem in imitation learning. While it is possible to condition on the entire language instruction directly, such an approach could suffer from generalization issues. To encode complex instructions into skills that can generalize to unseen instructions, we propose Learning Interpretable Skill Abstractions (LISA), a hierarchical imitation learning framework that can learn diverse, interpretable skills from language-conditioned demonstrations. LISA uses vector quantization to learn discrete skill codes that are highly correlated with language instructions and the behavior of the learned policy. In navigation and robotic manipulation environments, LISA outperforms a strong non-hierarchical baseline in the low-data regime and is able to compose learned skills to solve tasks containing unseen long-range instructions. Our method demonstrates a more natural way to condition on language in sequential decision-making problems and to achieve interpretable and controllable behavior with the learned skills.
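To make the vector quantization step concrete, the following is a minimal sketch of the generic nearest-codebook lookup that maps a continuous encoding to a discrete code index. It is not taken from the LISA implementation: the function names, the toy 2-D codebook, and the example encodings are all hypothetical, and in practice the codebook would be learned jointly with the policy rather than fixed by hand.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(z: Vector, codebook: Sequence[Vector]) -> Tuple[int, Vector]:
    """Return (index, code vector) of the codebook entry nearest to z."""
    index = min(range(len(codebook)),
                key=lambda k: squared_distance(z, codebook[k]))
    return index, codebook[index]


# Hypothetical toy codebook holding three 2-D "skill codes".
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Hypothetical continuous encodings, e.g. one per decision step.
encodings = [(0.9, 0.1), (0.1, 0.8), (0.2, 0.1)]

# Each encoding is replaced by the index of its nearest code.
skill_ids: List[int] = [quantize(z, codebook)[0] for z in encodings]
print(skill_ids)  # prints [1, 2, 0]
```

Because every encoding collapses to one of a small number of indices, the resulting codes can be inspected and compared against the instructions that produced them, which is the property the abstract refers to as interpretability.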