We present a novel approach for image-goal navigation, in which an agent navigates toward a goal specified only by an image rather than by precise target information, which makes the task more challenging. Our aim is to decouple the learning of navigation goal planning, collision avoidance, and navigation-ending prediction, so that each part can be learned in a more focused manner. This is realized by four separate modules. The first module maintains an obstacle map during robot navigation. The second periodically predicts a long-term goal on the real-time map, thereby converting the image-goal navigation task into a sequence of point-goal navigation tasks. To accomplish these point-goal tasks, the third module plans collision-free command sets for navigating to the long-term goals. The final module stops the robot appropriately once it is near the goal image. The four modules are designed or maintained separately, which shortens the search time during navigation and improves generalization to previously unseen real scenes. We evaluate the method both in a simulator and in the real world with a mobile robot. The results in complex real environments show that our method attains at least a $17\%$ increase in navigation success rate and a $23\%$ decrease in navigation collision rate over several state-of-the-art models.
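To make the modular decomposition concrete, the sketch below shows one possible way the four modules could interact in a navigation loop. It is a minimal illustration only: all class names, interfaces, and the DummyEnv are assumptions for exposition and do not reflect the authors' actual implementation.

```python
# Hypothetical sketch of the four-module decomposition described above.
# Class names, method signatures, and DummyEnv are illustrative assumptions.

class Mapper:
    """Module 1: maintain an obstacle map during navigation."""
    def __init__(self):
        self.obstacle_map = []

    def update(self, observation):
        self.obstacle_map.append(observation)    # placeholder map update
        return self.obstacle_map


class GoalPredictor:
    """Module 2: periodically predict a long-term goal on the current map,
    turning image-goal navigation into point-goal sub-tasks."""
    def predict(self, obstacle_map, goal_image):
        return (0.0, 0.0)                        # placeholder map coordinate


class LocalPlanner:
    """Module 3: plan collision-free commands toward the long-term goal."""
    def plan(self, obstacle_map, long_term_goal):
        return ["move_forward"]                  # placeholder command set


class StopPolicy:
    """Module 4: decide whether the robot is close enough to the goal image."""
    def should_stop(self, observation, goal_image):
        return False                             # placeholder criterion


class DummyEnv:
    """Stand-in environment so the loop below runs end to end."""
    def reset(self):
        return {"rgb": None, "depth": None}

    def step(self, command):
        return {"rgb": None, "depth": None}


def navigate(env, goal_image, max_steps=100, replan_every=25):
    mapper, goal_pred = Mapper(), GoalPredictor()
    planner, stopper = LocalPlanner(), StopPolicy()
    obs, long_term_goal = env.reset(), None

    for step in range(max_steps):
        obstacle_map = mapper.update(obs)
        if step % replan_every == 0:             # periodic long-term goal update
            long_term_goal = goal_pred.predict(obstacle_map, goal_image)
        for command in planner.plan(obstacle_map, long_term_goal):
            obs = env.step(command)
        if stopper.should_stop(obs, goal_image):
            break                                # module 4 ends the episode


if __name__ == "__main__":
    navigate(DummyEnv(), goal_image=None)
```

Keeping the mapper, long-term goal predictor, local planner, and stopping policy behind separate interfaces is what allows each part to be trained or engineered independently, as the abstract describes.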