Understanding the geometric relationships between objects in a scene is a core capability that enables both humans and autonomous agents to navigate new environments. A sparse, unified representation of the scene topology allows an agent to move efficiently through its environment, communicate the environment's state to others, and reuse the representation for diverse downstream tasks. To this end, we propose a method that trains an autonomous agent to accumulate a 3D scene graph representation of its environment while simultaneously learning to navigate through that environment. We demonstrate that our approach, GraphMapper, learns effective navigation policies with fewer environment interactions than vision-based systems alone. Further, we show that GraphMapper can act as a modular scene encoder alongside existing learning-based solutions, not only increasing navigational efficiency but also generating intermediate scene representations that are useful for other future tasks.