Efficient planning in continuous state and action spaces is fundamentally hard, even when the transition model is deterministic and known. One way to alleviate this challenge is to perform bilevel planning with abstractions, where a high-level search for abstract plans is used to guide planning in the original transition space. Previous work has shown that when state abstractions in the form of symbolic predicates are hand-designed, operators and samplers for bilevel planning can be learned from demonstrations. In this work, we propose an algorithm for learning predicates from demonstrations, eliminating the need for manually specified state abstractions. Our key idea is to learn predicates by optimizing a surrogate objective that is tractable but faithful to our real efficient-planning objective. We use this surrogate objective in a hill-climbing search over predicate sets drawn from a grammar. Experimentally, we show across four robotic planning environments that our learned abstractions are able to quickly solve held-out tasks, outperforming six baselines. Code: https://tinyurl.com/predicators-release
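As a rough illustration of the hill-climbing search over predicate sets mentioned above, here is a minimal sketch in Python. It is not the released implementation: the function names, the predicate names, and the toy scoring function are all hypothetical stand-ins. In particular, `toy_score` is not the surrogate objective from the paper; it only takes that objective's place so the search loop can be run.

```python
from typing import Callable, FrozenSet, Iterable


def hill_climb(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy hill-climbing over subsets of candidate predicates.

    Starts from the empty set and repeatedly adds the single candidate
    that lowers the score the most; stops when no addition improves it.
    """
    pool = list(candidates)
    current: FrozenSet[str] = frozenset()
    current_score = score(current)
    while True:
        best_set, best_score = None, current_score
        for cand in pool:
            if cand in current:
                continue
            neighbor = current | {cand}
            neighbor_score = score(neighbor)
            if neighbor_score < best_score:
                best_set, best_score = neighbor, neighbor_score
        if best_set is None:
            return current  # local minimum: no single addition helps
        current, current_score = best_set, best_score


def toy_score(predicates: FrozenSet[str]) -> float:
    """Hypothetical stand-in for a surrogate objective (lower is better).

    Penalizes each missing "useful" predicate heavily and each included
    predicate lightly, so the search prefers small but sufficient sets.
    """
    useful = {"Holding", "On"}  # assumed for illustration only
    return 10.0 * len(useful - predicates) + 1.0 * len(predicates)


if __name__ == "__main__":
    # Candidate names are made up; in the paper they come from a grammar.
    result = hill_climb(["Holding", "On", "Clear"], toy_score)
    print(sorted(result))
```

In this sketch the search only ever adds one predicate at a time and stops at the first local minimum, which keeps the number of score evaluations small but offers no guarantee of finding the globally best set.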