This study presents a new learning-based motion-planning methodology for autonomous exploration with aerial robots. Through reinforcement learning, i.e., learning by trial and error, an action policy is derived that can guide autonomous exploration of underground and tunnel environments. A new Markov decision process state is designed so that the robot's action policy can be learned entirely in simulation and then applied to real-world environments without further training. The method reduces the need for the precise map required by grid-based path planners and achieves map-less navigation. It produces paths at a lower computational cost than a grid-based planner while achieving comparable performance. The trained action policy is extensively evaluated in both simulation and field trials involving autonomous exploration of underground mines and indoor spaces.
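To make the trial-and-error principle behind the abstract concrete, the following is a minimal, purely illustrative sketch: tabular Q-learning on a toy one-dimensional corridor. The state space, rewards, and hyperparameters here are assumptions for illustration only; the paper's actual MDP state design, observations, and policy representation are not shown.

```python
import random

def train_policy(n_states=8, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Toy tabular Q-learning: learn by trial and error to traverse a corridor.

    Actions: 0 = move back, 1 = move forward. The agent starts at cell 0
    and is rewarded for reaching the far end (a stand-in for exploring
    deeper into a tunnel). All numbers are illustrative assumptions.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[s][a]
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # epsilon-greedy action selection (the "trial" in trial and error)
            a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: Q[s][a])
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else -0.01  # small step cost, goal bonus
            # Q-learning update (the "error" correction)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train_policy()
# Greedy policy extracted from the learned Q-table
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(len(Q))]
```

After training, the greedy policy selects "forward" in every non-terminal state, illustrating how a usable action policy emerges from simulated interaction alone, without a map of the environment.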