Efficient ObjectGoal navigation (ObjectNav) in novel environments requires an understanding of the spatial and semantic regularities in environment layouts. In this work, we present a straightforward method for learning these regularities by predicting the locations of unobserved objects from incomplete semantic maps. Our method differs from previous prediction-based navigation methods, such as frontier potential prediction or egocentric map completion, in that it directly predicts unseen targets while leveraging the global context from all previously explored areas. Our prediction model is lightweight and can be trained in a supervised manner using a relatively small amount of passively collected data. Once trained, the model can be incorporated into a modular pipeline for ObjectNav without the need for any reinforcement learning. We validate the effectiveness of our method on the HM3D and MP3D ObjectNav datasets. We find that it achieves state-of-the-art performance on both datasets, despite not using any additional data for training.
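As a rough illustration of how a predicted target-location map might feed a modular pipeline, the sketch below selects a navigation goal on a top-down grid. It is not the authors' implementation: the function name `select_goal`, the array layout, and the rule of taking the most probable unexplored cell are all assumptions made for this example, since the abstract does not specify how predictions are converted into goals.

```python
import numpy as np

def select_goal(pred_prob, explored, target_observed=None):
    """Pick a goal cell (row, col) on a top-down grid map.

    pred_prob: (H, W) float array; hypothetical model output giving the
        predicted probability that the target occupies each cell.
    explored: (H, W) bool array; True where the agent has already observed.
    target_observed: optional (H, W) bool array; True where the target
        has actually been seen in the semantic map.
    """
    # If the target has already been observed, head straight for it.
    if target_observed is not None and target_observed.any():
        rows, cols = np.nonzero(target_observed)
        return int(rows[0]), int(cols[0])

    # Otherwise, only unexplored cells can hold an unseen target.
    masked = np.where(explored, -np.inf, pred_prob)
    if not np.isfinite(masked).any():
        return None  # nothing left to explore

    # Assumed rule: go to the most probable unexplored cell.
    r, c = np.unravel_index(int(np.argmax(masked)), masked.shape)
    return int(r), int(c)
```

In a modular pipeline of this kind, the returned cell would typically be handed to a separate local planner, and the goal would be recomputed as the semantic map grows.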