Accurate perception of objects in the environment is important for improving the scene understanding capability of SLAM systems. In robotic and augmented reality applications, object maps that carry both semantic and metric information offer clear advantages. In this paper, we present RO-MAP, a novel multi-object mapping pipeline that does not rely on 3D priors. Given only monocular input, we use neural radiance fields to represent objects and couple them with a lightweight object SLAM based on multi-view geometry, to simultaneously localize objects and implicitly learn their dense geometry. We create a separate implicit model for each detected object and train these models dynamically and in parallel as new observations are added. Experiments on synthetic and real-world datasets demonstrate that our method can generate a semantic object map with shape reconstruction, and that it is competitive with offline methods while achieving real-time performance (25 Hz). The code and dataset will be available at: https://github.com/XiaoHan-Git/RO-MAP
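The idea of keeping one implicit model per detected object and training the models in parallel as observations arrive can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the RO-MAP code: the names `ObjectModel`, `ObjectMap`, `add_observation`, and `train_all` are hypothetical, the "model" is a placeholder that only counts training steps instead of optimizing a radiance field, and a thread pool stands in for whatever parallel training mechanism the actual system uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ObjectModel:
    """Hypothetical stand-in for one object's implicit model."""
    object_id: int
    observations: List[Any] = field(default_factory=list)
    steps_trained: int = 0

    def train_step(self) -> int:
        # Placeholder for one optimization step on this object's data;
        # a real model would update network weights here.
        self.steps_trained += 1
        return self.steps_trained


class ObjectMap:
    """Hypothetical registry that holds one model per detected object."""

    def __init__(self, max_workers: int = 4) -> None:
        self.models: Dict[int, ObjectModel] = {}
        self.pool = ThreadPoolExecutor(max_workers=max_workers)

    def add_observation(self, object_id: int, observation: Any) -> ObjectModel:
        # Create the model the first time an object is seen,
        # then keep appending observations to it.
        model = self.models.get(object_id)
        if model is None:
            model = ObjectModel(object_id)
            self.models[object_id] = model
        model.observations.append(observation)
        return model

    def train_all(self) -> Dict[int, int]:
        # Submit one training step per object so the models advance
        # concurrently, then wait for every step to finish.
        futures = {
            oid: self.pool.submit(model.train_step)
            for oid, model in self.models.items()
        }
        return {oid: future.result() for oid, future in futures.items()}

    def close(self) -> None:
        self.pool.shutdown(wait=True)
```

In this sketch, an object detected for the first time in a later frame simply gets a fresh model that joins the next call to `train_all`, while the models of earlier objects continue training from where they left off.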