We present a novel Deep Reinforcement Learning (DRL)-based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefits of the Dynamic Window Approach (DWA), which satisfies the robot's dynamics constraints, with state-of-the-art DRL-based navigation methods that handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the motions of environmental obstacles in a novel low-dimensional observation space. It also uses a novel reward function to positively reinforce velocities that move the robot away from an obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential-drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to a 33\% increase), number of dynamics constraint violations (up to a 61\% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.