In this paper, we address the problem of stochastic motion planning under partial observability, specifically how to navigate a mobile robot equipped with continuous range sensors such as LIDAR. In contrast to many existing robotic motion planning methods, we explicitly account for uncertainty in the robot state by modeling the system as a partially observable Markov decision process (POMDP). Recent general-purpose POMDP solvers are typically limited to discrete observation spaces and do not readily apply to this problem because LIDAR measurements are continuous. We build upon an existing Monte Carlo Tree Search method, POMCP, and propose a new algorithm, POMCP++. Our algorithm handles continuous observation spaces with a novel measurement selection strategy. POMCP++ also overcomes over-optimism in the value estimate of a rollout policy by removing the implicit perfect-state assumption in the rollout phase. We validate POMCP++ in theory by proving it is a Monte Carlo Tree Search algorithm. Through comparisons with other methods that can also be applied to the proposed problem, we show that POMCP++ yields a significantly higher success rate and total reward.