This paper presents a Deep Reinforcement Learning-based navigation approach in which we define the occupancy observations as heuristic evaluations of motion primitives, rather than using raw sensor data. Our method enables fast mapping of occupancy data, generated by multi-sensor fusion, into trajectory values in the 3D workspace. The computationally efficient trajectory evaluation allows dense sampling of the action space. We utilize our occupancy observations in different data structures to analyze their effects on both the training process and navigation performance. We train and test our methodology on two different robots within challenging physics-based simulation environments that include static and dynamic obstacles. We benchmark our occupancy representations against other conventional data structures from state-of-the-art methods. The trained navigation policies are also successfully validated on physical robots in dynamic environments. The results show that our method not only decreases the required training time but also improves navigation performance compared to other occupancy representations. The open-source implementation of our work and all related information are available at \url{https://github.com/RIVeR-Lab/tentabot}.