We present a novel method for safely navigating a robot in unknown, uneven outdoor terrain. Our approach trains a Deep Reinforcement Learning (DRL)-based network with channel and spatial attention modules, using a new reward function, to compute an attention map of the environment. The attention map identifies regions of the environment's elevation map with high elevation gradients, where the robot could lose stability or even flip over. We transform this attention map into a 2D navigation cost-map that encodes the planarity (level of flatness) of the terrain. Using the cost-map, we formulate a method for computing local least-cost waypoints leading to the robot's goal, and we integrate our approach with DWA-RL, a state-of-the-art navigation method. Our approach guarantees safe, locally least-cost paths and dynamically feasible robot velocities in highly uneven terrain. This hybrid design also yields a low sim-to-real gap, a problem that typically arises when training DRL networks. We observe improvements in success rate, in the cumulative elevation gradient along the robot's trajectory, and in the safety of the robot's velocity. We evaluate our method on a real Husky robot in highly uneven real-world terrain and demonstrate its benefits.
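The cost-map and waypoint steps described above can be illustrated with a minimal sketch. It is not the method of this work: the learned attention map is replaced here by the raw elevation-gradient magnitude, and the function names (`gradient_costmap`, `local_waypoint`), the search `radius`, and the `goal_weight` trade-off are all assumptions introduced only for illustration.

```python
import numpy as np


def gradient_costmap(elevation, cell_size=1.0):
    """Return a cost-map in [0, 1] from elevation-gradient magnitude.

    Assumption: this hand-computed gradient stands in for the learned
    attention map; flat cells get cost 0, the steepest cell gets cost 1.
    """
    dz_drow, dz_dcol = np.gradient(elevation, cell_size)
    magnitude = np.hypot(dz_drow, dz_dcol)
    peak = magnitude.max()
    if peak == 0:
        # Perfectly flat terrain: every cell is equally cheap.
        return np.zeros_like(magnitude)
    return magnitude / peak


def local_waypoint(costmap, robot, goal, radius=3, goal_weight=0.1):
    """Pick the cell near the robot with the lowest combined score.

    Score = terrain cost + goal_weight * Euclidean distance to the goal.
    Both `radius` and `goal_weight` are hypothetical tuning parameters.
    """
    rows, cols = costmap.shape
    r0, c0 = robot
    best, best_score = None, np.inf
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            if (r, c) == (r0, c0):
                continue  # the robot's own cell is not a waypoint
            score = costmap[r, c] + goal_weight * np.hypot(goal[0] - r, goal[1] - c)
            if score < best_score:
                best, best_score = (r, c), score
    return best
```

On a flat grid the selected waypoint is simply the candidate closest to the goal. When a steep step lies between the robot and the goal, the high cost of the step cells pushes the waypoint onto flat ground instead. In a full system, a local planner such as DWA-RL would then generate velocities toward that waypoint.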