We aim for domestic robots to operate indoors for long-term service. Under the object-level scene dynamics induced by human daily activities, a robot needs to localize itself robustly despite scene uncertainties. Previous works have addressed vision-based localization in static environments, yet object-level scene dynamics challenge existing methods during long-term deployment. This paper proposes the SEmantic understANding Network (SeanNet), which enables robots to measure the similarity between two scenes in both visual and semantic aspects. We further develop a SeanNet-based similarity localization method for monitoring the progress of visual navigation tasks. In our experiments, we benchmarked SeanNet against baseline methods on scene similarity measures, as well as on visual navigation performance when integrated with a visual navigator. We demonstrate that SeanNet outperforms all baseline methods by robustly localizing the robot under object dynamics, thus reliably informing visual navigation of the task status.