Bilevel planning, in which a high-level search over an abstraction of an environment is used to guide low-level decision making, is an effective approach to solving long-horizon tasks in continuous state and action spaces. Recent work has shown that action abstractions that enable such bilevel planning can be learned in the form of symbolic operators and neural samplers given symbolic predicates and demonstrations that achieve known goals. In this work, we show that existing approaches fall short in environments where actions tend to cause a large number of predicates to change. To address this issue, we propose to learn operators with ignore effects. The key idea motivating our approach is that modeling every observed change in the predicates is unnecessary; the only changes that need be modeled are those that are necessary for high-level search to achieve the specified goal. Experimentally, we show that our approach is able to learn operators with ignore effects across six hybrid robotic domains that enable an agent to solve novel variations of a task, with different initial states, goals, and numbers of objects, significantly more efficiently than several baselines.
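To make the idea of ignore effects concrete, the following is a minimal sketch of a ground symbolic operator that carries a set of ignored predicates in addition to the usual add and delete effects. It is not the paper's implementation. The class `Operator`, the tuple encoding of atoms, the toy predicates `At`, `Holding`, and `Reachable`, and the semantics chosen here are all assumptions made for illustration. Under the assumed semantics, atoms of an ignored predicate are dropped from the successor abstract state, so high-level search neither predicts them nor relies on them.

```python
from dataclasses import dataclass

# An atom is a tuple: (predicate_name, object_1, object_2, ...).
# This encoding is a hypothetical choice for the sketch.


@dataclass(frozen=True)
class Operator:
    """A ground symbolic operator with ignore effects (illustrative only)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset
    # Names of predicates whose changes the operator does not model.
    ignore_effects: frozenset = frozenset()

    def applicable(self, state: frozenset) -> bool:
        # The operator applies when every precondition atom holds.
        return self.preconditions <= state

    def apply(self, state: frozenset) -> frozenset:
        # Assumed semantics: drop all atoms of ignored predicates,
        # then apply delete and add effects as usual.
        kept = frozenset(a for a in state if a[0] not in self.ignore_effects)
        return (kept - self.delete_effects) | self.add_effects


# Toy example: moving the robot changes many Reachable atoms at once.
# Rather than modeling each change, the operator ignores Reachable.
state = frozenset({
    ("At", "robot", "a"),
    ("Holding", "cup"),
    ("Reachable", "cup"),
    ("Reachable", "plate"),
})

move_a_b = Operator(
    name="Move(a, b)",
    preconditions=frozenset({("At", "robot", "a")}),
    add_effects=frozenset({("At", "robot", "b")}),
    delete_effects=frozenset({("At", "robot", "a")}),
    ignore_effects=frozenset({"Reachable"}),
)

assert move_a_b.applicable(state)
next_state = move_a_b.apply(state)
print(sorted(next_state))
```

In this toy case the operator stays small even though the action changes many `Reachable` atoms. Whether to drop ignored atoms or treat them in some other way is a design choice that this sketch does not settle.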