Bilevel planning, in which a high-level search over an abstraction of an environment is used to guide low-level decision-making, is an effective approach to solving long-horizon tasks in continuous state and action spaces. Recent work has shown how to enable such bilevel planning by learning action and transition model abstractions in the form of symbolic operators and neural samplers. In this work, we show that existing symbolic operator learning approaches fall short in many natural environments where agent actions tend to cause a large number of irrelevant propositions to change. This is primarily because they attempt to learn operators that minimize prediction error with respect to observed changes in the propositions. To overcome this issue, we propose to learn operators that only model changes necessary for abstract planning to achieve the specified goal. Experimentally, we show that our approach learns operators that lead to efficient planning across 10 different hybrid robotics domains, including 4 from the challenging BEHAVIOR-100 benchmark, with generalization to novel initial states, goals, and objects.