We consider the problem of object goal navigation in unseen environments. In our view, solving this problem requires learning contextual semantic priors, a challenging endeavour given the spatial and semantic variability of indoor environments. Current methods learn to implicitly encode these priors through goal-oriented navigation policy functions operating on spatial representations that are limited to the agent's observable areas. In this work, we propose a novel framework that actively learns to generate semantic maps outside the field of view of the agent and leverages the uncertainty over the semantic classes in the unobserved areas to decide on long-term goals. We demonstrate that this spatial prediction strategy allows us to learn semantic priors over scenes that can be leveraged in unknown environments. Additionally, we show how different objectives can be defined by balancing exploration with exploitation while searching for semantic targets. Our method is validated in the visually realistic environments offered by the Matterport3D dataset and shows state-of-the-art results on the object goal navigation task.
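As a rough illustration of how uncertainty over unobserved semantic classes could drive the choice of a long-term goal, the sketch below scores every map cell by the predicted probability of the target class plus a weighted uncertainty term. The abstract does not say how uncertainty is estimated or combined with the prediction, so everything here is an assumption: the function name `select_long_term_goal`, the use of disagreement across an ensemble of map predictors as the uncertainty estimate, and the exploration weight `alpha` are all hypothetical.

```python
import numpy as np


def select_long_term_goal(ensemble_probs, target_class, alpha=1.0):
    """Pick a goal cell from an ensemble of predicted semantic maps.

    ensemble_probs: array of shape (n_models, n_classes, H, W) holding each
        model's per-cell class probabilities, including unobserved cells.
    target_class: index of the semantic class being searched for.
    alpha: exploration weight (assumed); 0 gives pure exploitation.
    """
    # Probability of the target class at each cell, per ensemble member.
    target = ensemble_probs[:, target_class]  # (n_models, H, W)

    # Exploitation: where the models agree the target is likely to be.
    mean = target.mean(axis=0)

    # Exploration: disagreement between models, used as an uncertainty proxy.
    std = target.std(axis=0)

    # Combine both terms and take the best-scoring cell as the goal.
    score = mean + alpha * std
    row, col = np.unravel_index(np.argmax(score), score.shape)
    return (int(row), int(col)), score
```

Setting `alpha` to zero sends the agent to the cell where the target is most likely on average, while larger values favour cells where the predictors disagree. This is one simple way to define different objectives along the exploration and exploitation trade-off mentioned above.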