Monte-Carlo Robot Path Planning

Path planning is a crucial algorithmic approach for designing robot behaviors. Sampling-based approaches, such as rapidly exploring random trees (RRTs) or probabilistic roadmaps, are prominent algorithmic solutions to path planning problems. Despite its exponential convergence rate, RRT can only find suboptimal paths. On the other hand, $\textrm{RRT}^*$, a widely used extension of RRT, guarantees probabilistic completeness for finding optimal paths but suffers in practice from slow convergence in complex environments. Furthermore, real-world robotic environments are often partially observable or have poorly described dynamics, which makes applying $\textrm{RRT}^*$ to complex tasks suboptimal. This paper studies a novel algorithmic formulation of the popular Monte-Carlo tree search (MCTS) algorithm for robot path planning. Notably, we study Monte-Carlo Path Planning (MCPP) by analyzing and proving, on the one hand, its exponential convergence rate to the optimal path in fully observable Markov decision processes (MDPs) and, on the other hand, its probabilistic completeness for finding feasible paths in partially observable MDPs (POMDPs) under the assumption of limited distance observability (proof sketch). Our algorithmic contribution allows us to employ recently proposed variants of MCTS with different exploration strategies for robot path planning. Our experimental evaluations in simulated 2D and 3D environments with a 7-degree-of-freedom (DOF) manipulator, as well as in a real-world robot path planning task, demonstrate the superiority of MCPP in POMDP tasks.
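
To make the idea of tree search for path planning concrete, the following is a minimal sketch of generic MCTS with UCB1 selection (often called UCT) on a small, fully observable 2D grid. It is not the MCPP algorithm analyzed in the paper, whose details are not given in this abstract. The grid model, the discount factor of 0.95, the exploration constant `c`, and every name (`Node`, `neighbors`, `rollout`, `uct_search`) are assumptions made purely for illustration.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # 4-connected grid actions (assumed)
GAMMA = 0.95  # illustrative discount so that shorter paths score higher


class Node:
    """One search-tree node: a grid cell plus visit statistics."""

    def __init__(self, cell, parent=None):
        self.cell = cell
        self.parent = parent
        self.children = {}  # move -> Node
        self.visits = 0
        self.value = 0.0    # sum of returns backed up through this node


def neighbors(cell, size, obstacles):
    """Yield (move, next_cell) pairs that stay on the grid and avoid obstacles."""
    for dx, dy in MOVES:
        nxt = (cell[0] + dx, cell[1] + dy)
        if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in obstacles:
            yield (dx, dy), nxt


def rollout(cell, goal, size, obstacles, rng, horizon):
    """Random walk; return a discounted reward if the goal is reached, else 0."""
    for step in range(horizon):
        if cell == goal:
            return GAMMA ** step
        options = [nxt for _, nxt in neighbors(cell, size, obstacles)]
        if not options:
            return 0.0
        cell = rng.choice(options)
    return GAMMA ** max(horizon, 0) if cell == goal else 0.0


def uct_search(start, goal, size, obstacles, rng, iterations=500, horizon=30, c=1.4):
    """Grow a search tree with UCB1 and return the most-visited path from start."""
    root = Node(start)
    for _ in range(iterations):
        node, depth = root, 0
        # Selection and expansion: descend until a new child is added.
        while node.cell != goal and depth < horizon:
            legal = list(neighbors(node.cell, size, obstacles))
            if not legal:
                break
            untried = [(m, n) for m, n in legal if m not in node.children]
            if untried:
                move, nxt = rng.choice(untried)
                node.children[move] = Node(nxt, parent=node)
                node, depth = node.children[move], depth + 1
                break
            parent = node
            # UCB1: mean return plus an exploration bonus.
            node = max(
                parent.children.values(),
                key=lambda ch: ch.value / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
            depth += 1
        # Simulation from the reached node, discounted by its depth.
        reward = (GAMMA ** depth) * rollout(
            node.cell, goal, size, obstacles, rng, horizon - depth
        )
        # Backpropagation up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract a path by following the most-visited child at each level.
    path, node = [root.cell], root
    while node.children and node.cell != goal:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.cell)
    return path
```

Calling `uct_search((0, 0), (3, 3), 4, {(1, 1), (1, 2)}, random.Random(0))` returns a list of adjacent, obstacle-free cells starting at `(0, 0)`. Whether that list ends at the goal depends on the iteration budget and the random seed, so this sketch carries none of the convergence or completeness guarantees discussed above.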
