Deep Reinforcement Learning-Based Mapless Crowd Navigation with Perceived Risk of the Moving Crowd for Mobile Robots

Classical map-based methods are commonly used for robot navigation, but they often struggle in crowded environments because of the Frozen Robot Problem (FRP). Deep reinforcement learning (DRL) methods address the FRP but suffer from limited generalization and scalability. To overcome these challenges, we propose a method that uses Collision Probability (CP) to help the robot navigate safely through crowds. Including CP in the observation space gives the robot a sense of how dangerous the moving crowd is: the robot navigates through the crowd when it appears safe and takes a detour when the crowd is moving aggressively. By focusing on the most dangerous obstacle, the robot is not confused when crowd density is high, which keeps the model scalable. Our approach was developed using DRL and trained in the Gazebo simulator in a non-cooperative crowd environment with obstacles moving at randomized speeds and directions. We then evaluated our model on four crowd-behavior scenarios with varying crowd densities. The results show that our method achieved a 100% success rate in all test settings. We compared our approach with a current state-of-the-art DRL-based approach, and ours performed significantly better. Importantly, our method is highly generalizable and requires no fine-tuning after being trained once. We further demonstrated the crowd navigation capability of our model in real-world tests.

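The abstract does not state how CP is computed, so the following is only an illustrative sketch of the idea of adding a risk score for the most dangerous obstacle to the observation. Everything in it is an assumption made for illustration: the `Obstacle` type, the closest-approach heuristic in `collision_probability`, the `safe_radius` and `horizon` parameters, and the layout of the observation vector. It is not the paper's formula or architecture.

```python
import math
from dataclasses import dataclass


@dataclass
class Obstacle:
    """Hypothetical obstacle state, expressed relative to the robot."""
    x: float   # relative position (m)
    y: float
    vx: float  # relative velocity (m/s)
    vy: float


def collision_probability(ob: Obstacle, safe_radius: float = 0.5,
                          horizon: float = 5.0) -> float:
    """Heuristic risk score in [0, 1]; an assumed stand-in for CP."""
    speed_sq = ob.vx ** 2 + ob.vy ** 2
    if speed_sq < 1e-9:
        t_star = 0.0  # no relative motion: closest approach is now
    else:
        # Time of closest approach under constant relative velocity,
        # clamped to the look-ahead window [0, horizon].
        t_star = -(ob.x * ob.vx + ob.y * ob.vy) / speed_sq
        t_star = min(max(t_star, 0.0), horizon)
    # Miss distance at the time of closest approach.
    d_min = math.hypot(ob.x + ob.vx * t_star, ob.y + ob.vy * t_star)
    if d_min >= safe_radius:
        return 0.0
    proximity = 1.0 - d_min / safe_radius  # closer pass -> higher risk
    urgency = 1.0 - t_star / horizon       # sooner pass -> higher risk
    return proximity * urgency


def build_observation(goal_dist: float, goal_angle: float,
                      obstacles: list[Obstacle]) -> list[float]:
    """Fixed-size observation: goal info plus the riskiest obstacle only."""
    if not obstacles:
        return [goal_dist, goal_angle, 0.0, 0.0, 0.0, 0.0, 0.0]
    worst = max(obstacles, key=collision_probability)
    cp = collision_probability(worst)
    return [goal_dist, goal_angle, cp, worst.x, worst.y, worst.vx, worst.vy]


if __name__ == "__main__":
    crowd = [
        Obstacle(x=2.0, y=0.0, vx=-1.0, vy=0.0),  # heading straight at the robot
        Obstacle(x=2.0, y=0.0, vx=1.0, vy=0.0),   # moving away
    ]
    print(build_observation(goal_dist=4.0, goal_angle=0.1, obstacles=crowd))
```

Because only the riskiest obstacle enters the observation, the vector length stays the same however many obstacles are present. This is one plausible reading of why focusing on the most dangerous obstacle helps scalability.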