We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals. At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over environmental states. RSGs can be learned from weakly annotated data: unsegmented demonstration trajectories paired with abstract task descriptions, which are composed of terms initially unknown to the agent (e.g., collect-wood then craft-boat then go-across-river). Our framework also discovers dependencies between RSGs, e.g., that collect-wood is a helpful subgoal for craft-boat. Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT, by setting helpful subgoals as waypoints for the planner, which significantly improves efficiency at performance time.
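To make the waypoint idea concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical 4x4 grid world in which a state is just an (x, y) position, it hand-writes the three subgoal classifiers instead of learning them from demonstrations, and it uses breadth-first search as a stand-in for A* or RRT. All function names and coordinates are illustrative assumptions.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

State = Tuple[int, int]            # assumption: a state is a grid position
Subgoal = Callable[[State], bool]  # a subgoal is a binary classifier over states

WIDTH, HEIGHT = 4, 4               # hypothetical toy grid size


def neighbors(state: State) -> List[State]:
    """Return the 4-connected neighbors that lie inside the grid."""
    x, y = state
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(nx, ny) for nx, ny in candidates
            if 0 <= nx < WIDTH and 0 <= ny < HEIGHT]


def search_to(start: State, subgoal: Subgoal) -> Optional[List[State]]:
    """Breadth-first search from start to the nearest state satisfying subgoal."""
    parent: Dict[State, Optional[State]] = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if subgoal(state):
            # Walk parent pointers back to the start to recover the path.
            path: List[State] = []
            cursor: Optional[State] = state
            while cursor is not None:
                path.append(cursor)
                cursor = parent[cursor]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def plan_with_waypoints(start: State,
                        subgoals: List[Subgoal]) -> Optional[List[State]]:
    """Chain one short search per subgoal instead of one long search to the goal."""
    path = [start]
    for subgoal in subgoals:
        segment = search_to(path[-1], subgoal)
        if segment is None:
            return None
        path.extend(segment[1:])  # skip the first state, already in path
    return path


# Hand-written stand-ins for learned subgoal classifiers (illustrative only).
collect_wood: Subgoal = lambda s: s == (0, 3)
craft_boat: Subgoal = lambda s: s == (3, 3)
go_across_river: Subgoal = lambda s: s == (3, 0)

demo_path = plan_with_waypoints((0, 0),
                                [collect_wood, craft_boat, go_across_river])
```

Each call to `search_to` only has to reach the next waypoint, so the search frontier stays small compared with a single search over the full task. In this sketch the ordering of subgoals is given by hand, whereas the abstract describes deriving such dependencies from data.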