Mobile robots are often tasked with repeatedly navigating an environment whose traversability changes over time. These changes may exhibit hidden structure that can be learned. Many studies consider reactive algorithms for online planning; however, these algorithms do not exploit past executions of the navigation task to improve future ones. In this paper, we formalize the problem of minimizing the total expected cost of performing multiple start-to-goal navigation tasks on a roadmap by introducing the Learned Reactive Planning Problem. We propose a method that captures information from past executions to learn a motion policy for handling obstacles the robot has seen before. We then propose the LAMP framework, which integrates the learned motion policy with an existing navigation stack. Finally, an extensive set of experiments in simulated and real-world environments shows that the proposed method outperforms state-of-the-art algorithms by 10% to 40% in terms of expected time to travel from start to goal. We also evaluate the robustness of the proposed method to localization and mapping errors on a real robot.