After getting lost, humans can robustly localize themselves without a map by following prominent visual cues or landmarks. In this work, we aim to endow autonomous agents with the same ability. This ability is important in robotics applications, yet it is very challenging when an agent operates in partially calibrated environments, where camera images with accurate 6-Degree-of-Freedom pose labels cover only part of the scene. To address this challenge, we explore using Reinforcement Learning to search for a policy that generates intelligent motions, enabling the agent to actively localize itself from visual information in partially calibrated environments. Our core contribution is to formulate the active visual localization problem as a Partially Observable Markov Decision Process and to propose an algorithmic framework based on Deep Reinforcement Learning to solve it. We further propose an indoor scene dataset, ACR-6, which consists of both synthetic and real data and simulates challenging scenarios for active visual localization. We benchmark our algorithm against handcrafted localization baselines and demonstrate that our approach significantly outperforms them in localization success rate.