This work proposes a new framework for a socially aware dynamic local planner in crowded environments, building on the recently proposed Trajectory-ranked Maximum Entropy Deep Inverse Reinforcement Learning (T-MEDIRL). To address the social navigation problem, our multi-modal learning planner explicitly incorporates social interaction factors and social-awareness factors into the T-MEDIRL pipeline to learn a reward function from human demonstrations. Moreover, we propose a novel trajectory ranking score based on sudden velocity changes of pedestrians around the robot to address the sub-optimality of human demonstrations. Our evaluation shows that this method enables a robot to navigate successfully in a crowded social environment and outperforms state-of-the-art social navigation methods in terms of success rate, navigation time, and invasion rate.
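The abstract does not state the formula for the trajectory ranking score, so the following is only a minimal sketch of one way a score based on sudden pedestrian velocity changes could be computed and used to order demonstrations. Every name and parameter here (`sudden_change_penalty`, `rank_demonstrations`, `radius`, `accel_threshold`, `dt`) is hypothetical and not taken from the paper. The sketch assumes that a demonstration which forces nearby pedestrians to change velocity abruptly is less desirable, and accumulates a penalty whenever a pedestrian within `radius` of the robot exceeds `accel_threshold`.

```python
import math


def sudden_change_penalty(robot_xy, ped_xy, ped_vel,
                          dt=0.1, radius=3.0, accel_threshold=2.0):
    """Hypothetical penalty for one demonstration (lower is better).

    robot_xy: list of (x, y) robot positions, one per time step.
    ped_xy:   list (per time step) of lists of (x, y) pedestrian positions.
    ped_vel:  same layout as ped_xy, holding (vx, vy) pedestrian velocities.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    penalty = 0.0
    for t in range(1, len(robot_xy)):
        rx, ry = robot_xy[t]
        for (px, py), (vx, vy), (ux, uy) in zip(ped_xy[t], ped_vel[t], ped_vel[t - 1]):
            # Only pedestrians close to the robot are attributed to its behavior.
            if math.hypot(px - rx, py - ry) > radius:
                continue
            # Magnitude of the velocity change per unit time.
            accel = math.hypot(vx - ux, vy - uy) / dt
            if accel > accel_threshold:
                penalty += accel - accel_threshold
    return penalty


def rank_demonstrations(demos, **kwargs):
    """Return demonstration indices ordered from least to most disruptive.

    demos: list of (robot_xy, ped_xy, ped_vel) tuples.
    """
    return sorted(range(len(demos)),
                  key=lambda i: sudden_change_penalty(*demos[i], **kwargs))
```

In a trajectory-ranked IRL setting, an ordering like the one returned by `rank_demonstrations` could be used to weight or compare demonstrations during reward learning; how the actual method does this is not described in the abstract.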