This paper studies how to improve the generalization performance and learning speed of navigation agents trained with deep reinforcement learning (DRL). DRL shows great potential for mapless navigation, but DRL agents that perform well in training scenarios are found to perform poorly in unfamiliar real-world scenarios. In this work, we identify the representation of LiDAR readings as a key factor behind this performance degradation and propose a simple but powerful input pre-processing (IP) approach to improve the agents' performance. Because this approach uses adaptively parametric reciprocal functions to pre-process LiDAR readings, we refer to it as IPAPRec and to its normalized version as IPAPRecN. IPAPRec/IPAPRecN highlight important short-distance values and compress the range of less important long-distance values in laser scans, which addresses the issues induced by conventional representations of laser scans. Their performance is validated by extensive simulation and real-world experiments. The results show that our methods substantially improve agents' success rates and greatly reduce training time compared with conventional methods.
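To make the idea of reciprocal pre-processing concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes a function of the form y = 1 / (d + alpha)^beta applied to each range reading d; the names `reciprocal_preprocess`, `alpha`, and `beta`, their default values, and the normalization by the value at d = 0 are all hypothetical choices made for illustration. In an adaptive variant the parameters would be adjusted during training; here they are fixed.

```python
from typing import List, Sequence


def reciprocal_preprocess(
    ranges: Sequence[float],
    alpha: float = 0.1,   # hypothetical offset; keeps the denominator positive at d = 0
    beta: float = 1.0,    # hypothetical exponent; larger values compress far readings more
    normalize: bool = False,
) -> List[float]:
    """Map LiDAR ranges (meters) through y = 1 / (d + alpha) ** beta.

    This is an illustrative stand-in for a parametric reciprocal function,
    not the function defined in the paper. With normalize=True the output is
    rescaled by alpha ** beta so that d = 0 maps to 1 and all values lie in (0, 1].
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    processed = []
    for d in ranges:
        if d < 0:
            raise ValueError("range readings must be non-negative")
        y = 1.0 / (d + alpha) ** beta
        if normalize:
            # The unnormalized maximum is 1 / alpha ** beta (reached at d = 0).
            y *= alpha ** beta
        processed.append(y)
    return processed


if __name__ == "__main__":
    scan = [0.2, 0.5, 5.0, 10.0]  # example readings in meters
    print(reciprocal_preprocess(scan))
    print(reciprocal_preprocess(scan, normalize=True))
```

With these assumed parameters, the 0.3 m difference between the two nearest readings produces a much larger change in the output than the 5 m difference between the two farthest ones, which is the qualitative behavior the abstract describes: short distances are emphasized and long distances are compressed.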