This paper studies how to improve the generalization performance and learning speed of navigation agents trained with deep reinforcement learning (DRL). Although DRL shows great potential for mapless robot navigation, DRL agents that perform well in their training scenarios often perform poorly in unfamiliar ones. We propose that the representation of LiDAR readings is a key factor behind this performance degradation and present an input pre-processing (IP) approach to address it. Because the approach pre-processes LiDAR readings with adaptively parametric reciprocal functions, we refer to it as IPAPRec and to its normalized version as IPAPRecN. IPAPRec/IPAPRecN highlight the important short-distance values in a laser scan and compress the range of the less important long-distance values, which addresses the issues induced by conventional laser-scan representations. Their performance was validated in extensive simulation and real-world experiments. The results show that our methods substantially improve navigation agents' generalization performance and greatly reduce training time compared with conventional methods.
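To make the idea of a reciprocal pre-processing step concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the exact functional form, so the mapping 1 / (alpha * d + beta), the parameter names `alpha` and `beta`, their default values, and the way the normalized variant is scaled are all assumptions made for illustration. In an adaptive scheme the parameters would be adjusted during training; here they are fixed arguments.

```python
from typing import List


def reciprocal_preprocess(ranges: List[float],
                          alpha: float = 1.0,
                          beta: float = 0.1) -> List[float]:
    """Map each range reading d (meters) to 1 / (alpha * d + beta).

    Hypothetical form: alpha and beta stand in for the adaptive
    parameters, which this sketch keeps fixed.
    """
    if alpha <= 0.0 or beta <= 0.0:
        raise ValueError("alpha and beta must be positive")
    # Clamp negative readings to zero so the denominator stays positive.
    return [1.0 / (alpha * max(d, 0.0) + beta) for d in ranges]


def normalized_reciprocal_preprocess(ranges: List[float],
                                     alpha: float = 1.0,
                                     beta: float = 0.1) -> List[float]:
    """Assumed normalization: scale by beta so outputs lie in (0, 1].

    The unnormalized maximum is 1 / beta (reached at d = 0), so
    multiplying by beta maps it to 1.
    """
    return [v * beta for v in reciprocal_preprocess(ranges, alpha, beta)]


# Two nearby obstacles and two distant ones, each pair 0.2 m apart.
scan = [0.2, 0.4, 5.0, 5.2]
transformed = reciprocal_preprocess(scan)
near_gap = abs(transformed[0] - transformed[1])
far_gap = abs(transformed[2] - transformed[3])
print(near_gap > far_gap)
```

Under these assumed parameters, the same 0.2 m difference produces a much larger change in the transformed value at short range than at long range, which is the qualitative behavior the abstract describes: short-distance values are emphasized and long-distance values are compressed.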