We consider the problems of exploration and point-goal navigation in previously unseen environments, where the spatial complexity of indoor scenes and partial observability make these tasks challenging. We argue that learning occupancy priors over indoor maps provides significant advantages in addressing these problems. To this end, we present a novel planning framework that first learns to generate occupancy maps beyond the agent's field of view, and then leverages the model uncertainty over the generated areas to formulate path selection policies for each task of interest. For point-goal navigation, the policy uses an upper confidence bound criterion to select efficient and traversable paths, while for exploration the policy maximizes model uncertainty over candidate paths. We perform experiments in the visually realistic environments of Matterport3D using the Habitat simulator and demonstrate: 1) improved results on exploration and map quality metrics over competitive methods, and 2) the effectiveness of our planning module when paired with the state-of-the-art DD-PPO method for the point-goal navigation task.
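The two path selection policies described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. It assumes that model uncertainty is estimated as the variance across an ensemble of occupancy predictors, and that candidate paths are given as lists of grid cells. The function names, the `alpha` exploration weight, and the `length_weight` penalty are all hypothetical choices made for illustration.

```python
import numpy as np

def path_stats(ensemble_probs, path):
    """Per-cell mean and variance of predicted occupancy along a path.

    ensemble_probs: array of shape (n_models, H, W) with occupancy
        probabilities in [0, 1] (assumed ensemble output).
    path: list of (row, col) grid cells.
    """
    rows, cols = zip(*path)
    samples = ensemble_probs[:, rows, cols]  # shape (n_models, path_len)
    return samples.mean(axis=0), samples.var(axis=0)

def select_exploration_path(ensemble_probs, paths):
    """Pick the candidate path with the highest mean model uncertainty."""
    scores = [path_stats(ensemble_probs, p)[1].mean() for p in paths]
    return int(np.argmax(scores))

def select_pointgoal_path(ensemble_probs, paths, alpha=0.5, length_weight=0.01):
    """Pick a path by an upper confidence bound on traversability.

    The score is optimistic about uncertain cells (mean free-space
    probability plus alpha times the standard deviation) and penalizes
    long paths. Both weights are illustrative assumptions.
    """
    scores = []
    for p in paths:
        mean_occ, var_occ = path_stats(ensemble_probs, p)
        ucb = ((1.0 - mean_occ) + alpha * np.sqrt(var_occ)).mean()
        scores.append(ucb - length_weight * len(p))
    return int(np.argmax(scores))
```

In this sketch, a path through cells where the ensemble members disagree is preferred for exploration, while the point-goal score trades off estimated free space against that same disagreement through `alpha`.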