Object-based maps are relevant for scene understanding since they integrate geometric and semantic information of the environment, allowing autonomous robots to robustly localize and interact with objects. In this paper, we address the task of constructing a metric-semantic map for the purpose of long-term object-based localization. We exploit 3D object detections from monocular RGB frames both for constructing the object-based map and for globally localizing in the constructed map. To tailor the approach to a target environment, we propose an efficient way of generating 3D annotations to fine-tune the 3D object detection model. We evaluate our map construction in an office building and test our long-term localization approach on challenging sequences recorded in the same environment over nine months. The experiments suggest that our approach is suitable for constructing metric-semantic maps and that our localization approach is robust to long-term changes. Both the mapping algorithm and the localization pipeline can run online on an onboard computer. We will release an open-source C++/ROS implementation of our approach.