In recent years, the growing demand for more intelligent service robots has been pushing the development of mobile robot navigation algorithms that allow safe and efficient operation in dense crowds. Reinforcement learning (RL) approaches have shown superior ability in solving sequential decision-making problems, and recent work has explored their potential to learn navigation policies in a socially compliant manner. However, the expert demonstration data used in existing methods is usually expensive and difficult to obtain. In this work, we consider the task of training an RL agent, without employing demonstration data, to achieve efficient and collision-free navigation in a crowded environment. To address the sparse-reward navigation problem, we propose to incorporate hindsight experience replay (HER) and curriculum learning (CL) techniques with RL to efficiently learn the optimal navigation policy in a dense crowd. The effectiveness of our method is validated in a simulated crowd-robot coexisting environment. The results demonstrate that our method can effectively learn human-aware navigation without requiring additional demonstration data.
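To illustrate the core idea behind HER in the sparse-reward setting, the sketch below shows a minimal goal-relabeling replay buffer using the "final" relabeling strategy. This is an illustrative assumption, not the paper's actual implementation: the 1-D state, the tolerance-based sparse reward, and the buffer interface are all simplified placeholders.

```python
import random
from collections import deque


class HindsightReplayBuffer:
    """Minimal sketch of hindsight experience replay (HER) with the
    'final' goal-relabeling strategy for a sparse-reward goal task.
    The 1-D state/goal representation and reward function below are
    simplifying assumptions for illustration only."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    @staticmethod
    def sparse_reward(achieved, goal, tol=0.5):
        # Sparse reward: 0 on reaching the goal, -1 otherwise.
        return 0.0 if abs(achieved - goal) <= tol else -1.0

    def store_episode(self, episode, goal):
        """episode: list of (state, action, achieved_state) tuples."""
        final_achieved = episode[-1][2]
        for state, action, achieved in episode:
            # Store the original transition with the intended goal.
            self.buffer.append(
                (state, action, self.sparse_reward(achieved, goal), goal))
            # Hindsight transition: relabel the goal as the state the
            # agent actually reached, so even failed episodes produce
            # informative, non-zero-gradient reward signal.
            self.buffer.append(
                (state, action,
                 self.sparse_reward(achieved, final_achieved),
                 final_achieved))

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Even when the agent never reaches its intended goal, the relabeled transitions receive reward 0 near the episode's final state, which is what lets HER densify an otherwise sparse reward signal.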