Learning from demonstration (LfD) methods have shown promise for solving multi-step tasks; however, these approaches do not guarantee successful reproduction of the task given disturbances. In this work, we identify the roots of such a challenge as the failure of the learned continuous policy to satisfy the discrete plan implicit in the demonstration. By utilizing modes (rather than subgoals) as the discrete abstraction and motion policies with both mode invariance and goal reachability properties, we prove our learned continuous policy can simulate any discrete plan specified by a Linear Temporal Logic (LTL) formula. Consequently, the imitator is robust to both task- and motion-level disturbances and guaranteed to achieve task success. Project page: https://sites.google.com/view/ltl-ds
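To make the mode-level robustness argument concrete, below is a minimal runnable sketch, not the paper's implementation: `Mode`, `execute_plan`, the 1-D interval modes, and the `push_back` disturbance are all illustrative assumptions. Each mode pairs an invariance check (the state stays in the mode's region) with a goal-reaching policy (a contraction toward the mode's goal), and the executor chains the modes of a discrete plan, re-localizing whenever a disturbance violates the current mode's invariance.

```python
"""Toy sketch (not the paper's implementation) of mode-based plan execution:
each mode has an invariance set and a goal-reaching policy; a discrete plan
(a mode sequence, e.g. derived from an LTL formula's automaton) is executed
by chaining per-mode policies and re-localizing after disturbances."""


class Mode:
    """A 1-D interval mode with a stable attractor at its goal (illustrative)."""
    def __init__(self, name, lo, hi, goal):
        self.name, self.lo, self.hi, self.goal = name, lo, hi, goal

    def contains(self, x):
        """Mode invariance check: is the state inside this mode's region?"""
        return self.lo <= x <= self.hi

    def step(self, x, dt=0.1):
        """Goal-reachability policy: contract toward the mode's goal."""
        return x + dt * (self.goal - x)


def execute_plan(plan, modes, x, tol=1e-2, disturb=None, max_steps=500):
    """Follow the mode sequence; if a disturbance ejects the state from the
    current mode, re-localize to whichever planned mode now contains it."""
    i = 0
    for t in range(max_steps):
        if i == len(plan):                     # all modes traversed: task done
            return x, True
        mode = modes[plan[i]]
        if not mode.contains(x):               # invariance violated: replan
            i = next((j for j, m in enumerate(plan)
                      if modes[m].contains(x)), None)
            if i is None:                      # state outside every planned mode
                return x, False
            continue
        x = mode.step(x)
        if disturb is not None:
            x = disturb(t, x)
        if abs(x - mode.goal) < tol:           # goal reached: snap and advance
            x, i = mode.goal, i + 1
    return x, i == len(plan)


# Two modes covering [0, 2]; the plan "visit goal 1, then goal 2" is the kind
# of mode sequence an LTL formula such as F(g1 & F g2) would induce.
modes = {"m1": Mode("m1", 0.0, 1.0, goal=1.0),
         "m2": Mode("m2", 1.0, 2.0, goal=2.0)}
plan = ["m1", "m2"]

# A one-shot disturbance that knocks the state out of mode m2 back into m1;
# the executor detects the invariance violation and re-traverses the plan.
push_back = lambda t, x: 0.3 if t == 60 else x
x_final, ok = execute_plan(plan, modes, x=0.1, disturb=push_back)
print(f"final state {x_final:.3f}, task success: {ok}")
```

Because every per-mode policy in this toy is both invariant and goal-reaching, a disturbance can only reset the executor to an earlier mode of the plan rather than strand it, which is the intuition behind the abstract's task-success claim.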