We present a novel outdoor navigation algorithm that generates stable and efficient actions to navigate a robot to its goal. Using a multi-stage training pipeline, we show that our model produces policies that result in stable and reliable robot navigation on complex terrains. Building on the Proximal Policy Optimization (PPO) algorithm, we developed a novel method that achieves multiple capabilities for outdoor navigation tasks, namely: alleviating the robot's drifting, keeping the robot stable on bumpy terrains, avoiding climbing hills with steep elevation changes, and avoiding collisions. Our training process mitigates the sim-to-real gap by introducing generalized environmental and robotic parameters and by training with rich lidar perception features in the Unity simulator. We evaluate our method in both simulation and the real world on Clearpath Husky and Jackal robots. Additionally, we compare our method against state-of-the-art approaches and show that in the real world it improves stability by at least 30.7% on uneven terrains and reduces drifting by 8.08%; on high hills, our trained policy keeps the robot's elevation change small at each motion step by preventing it from moving through areas with high gradients.