Knowledge-Based Hierarchical POMDPs for Task Planning

The main goal of task planning is to build a sequence of actions that takes an agent from an initial state to a goal state. In robotics, this is particularly difficult because actions usually have several possible outcomes and sensors are prone to producing noisy measurements. Partially observable Markov decision processes (POMDPs) are commonly employed because they can model uncertainty both in actions that modify the state of a system and in actions that monitor it. However, because solving a POMDP is computationally expensive, their use becomes prohibitive for most robotic applications. In this paper, we propose a task planning architecture for service robotics. In the context of service robot design, we present a scheme for encoding knowledge about the robot and its environment that promotes modularity and reuse of information. We also introduce a new recursive definition of a POMDP that enables our architecture to autonomously build a hierarchy of POMDPs, which it then uses to generate and execute plans that solve the task at hand. Experimental results show that, compared with baseline methods, the recursive hierarchical approach significantly reduces planning time while maintaining (or even improving) robustness across several scenarios that vary in uncertainty and size.

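To make concrete what it means for a POMDP to model uncertainty in both acting and sensing, the following is a minimal sketch of the standard Bayesian belief update for a flat, discrete POMDP. It is not the recursive hierarchical definition proposed in the paper. The door scenario, the state, action, and observation names, and all probabilities are hypothetical values chosen only for illustration.

```python
from typing import Dict, Tuple

Belief = Dict[str, float]
# T[(s, a)][s_next] = P(s_next | s, a); O[(s_next, a)][o] = P(o | s_next, a)
Transition = Dict[Tuple[str, str], Dict[str, float]]
Observation = Dict[Tuple[str, str], Dict[str, float]]


def belief_update(belief: Belief, action: str, observation: str,
                  T: Transition, O: Observation) -> Belief:
    """Return b'(s') proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: push the current belief through the action model.
        predicted = sum(T[(s, action)][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[(s_next, action)][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy problem: a robot is unsure whether a door is open.
ACTIONS = ("look", "push")
T: Transition = {
    ("open", "look"): {"open": 1.0, "closed": 0.0},
    ("closed", "look"): {"open": 0.0, "closed": 1.0},
    ("open", "push"): {"open": 1.0, "closed": 0.0},
    # Assumed: pushing a closed door opens it 70% of the time.
    ("closed", "push"): {"open": 0.7, "closed": 0.3},
}
# Assumed: the door sensor reports the true state 80% of the time.
O: Observation = {}
for a in ACTIONS:
    O[("open", a)] = {"see_open": 0.8, "see_closed": 0.2}
    O[("closed", a)] = {"see_open": 0.2, "see_closed": 0.8}

b0: Belief = {"open": 0.5, "closed": 0.5}
b_look = belief_update(b0, "look", "see_open", T, O)
b_push = belief_update(b0, "push", "see_open", T, O)
```

Starting from a uniform belief, looking and seeing an open door gives a posterior of 0.8 · 0.5 / (0.8 · 0.5 + 0.2 · 0.5) = 0.8 for "open". Pushing first and then seeing an open door gives a higher value, because the action itself shifts probability toward "open" before the sensor reading is taken into account. A planner has to reason over such beliefs rather than over known states, which is the source of the computational cost the abstract refers to.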