Fiducial markers can encode rich information about the environment and can help Visual SLAM (VSLAM) approaches reconstruct maps with practical semantic information. Current marker-based VSLAM approaches mainly use markers to improve feature detection in low-feature environments and/or to incorporate loop closure constraints, generating only low-level geometric maps that are prone to inaccuracies in complex environments. To bridge this gap, this paper presents a VSLAM approach that uses a monocular camera along with fiducial markers to generate hierarchical representations of the environment while improving the camera pose estimate. The proposed approach detects semantic entities in the surroundings, including walls, corridors, and rooms encoded within markers, and adds appropriate topological constraints among them. Experimental results on a real-world dataset collected with a robot demonstrate that, owing to these additional constraints, the proposed approach outperforms a traditional marker-based VSLAM baseline in terms of accuracy while creating enhanced map representations. Furthermore, it shows satisfactory results when the quality of the reconstructed map is compared with that of a map reconstructed using a LiDAR SLAM approach.