Reliable navigation systems have a wide range of applications in robotics and autonomous driving. Current approaches employ an open-loop process that converts sensor inputs directly into actions. However, these open-loop schemes struggle to handle complex and dynamic real-world scenarios due to their poor generalization. Imitating human navigation, we add a reasoning process that converts actions back into internal latent states, forming a two-stage closed loop of perception, decision-making, and reasoning. First, VAE-Enhanced Demonstration Learning endows the model with an understanding of basic navigation rules. Then, two dual processes in RL-Enhanced Interaction Learning generate reward feedback for each other and collectively enhance obstacle avoidance capability. The reasoning model substantially improves generalization and robustness, and facilitates deploying the algorithm on real-world robots without elaborate transfer. Experiments show our method adapts to novel scenarios better than state-of-the-art approaches.
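To make the closed-loop structure concrete, the following is a minimal illustrative sketch, not the paper's actual architecture: a perception model maps observations to a latent state, a decision model maps the latent to an action, and a reasoning model maps the action back to a latent. All names, dimensions, and the linear stand-in "networks" are assumptions for illustration; the agreement between the forward latent and the reasoned-back latent plays the role of a self-generated reward signal.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM, ACTION_DIM, OBS_DIM = 8, 2, 16  # illustrative sizes

# Toy linear maps standing in for learned networks (hypothetical).
W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM)) * 0.1    # perception: obs -> latent
W_dec = rng.normal(size=(ACTION_DIM, LATENT_DIM)) * 0.1  # decision: latent -> action
W_rsn = rng.normal(size=(LATENT_DIM, ACTION_DIM)) * 0.1  # reasoning: action -> latent

def step(obs):
    z = np.tanh(W_enc @ obs)          # perceive: infer internal latent state
    action = W_dec @ z                # decide: produce an action from the latent
    z_back = np.tanh(W_rsn @ action)  # reason: recover a latent from the action
    # Consistency between z and z_back closes the loop; its negative
    # distance could serve as reward feedback during RL fine-tuning.
    reward = -np.linalg.norm(z - z_back)
    return action, reward

obs = rng.normal(size=OBS_DIM)
action, reward = step(obs)
```

In this sketch the two dual processes (decision and reasoning) score each other: a latent that cannot be recovered from the chosen action yields a low reward, pushing both models toward mutually consistent behavior.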