This technical report introduces CyberLoc, an image-based visual localization pipeline for robust and accurate long-term pose estimation under challenging conditions. The method comprises four modules applied in sequence. First, a mapping module builds an accurate 3D map of the scene; when multiple reference sequences captured under different conditions are available, it builds one map per reference sequence. Second, a single-image localization pipeline (retrieval, matching, PnP) estimates a 6-DoF camera pose for each query image against each 3D map, yielding one candidate pose per map. Third, a consensus set maximization module filters out outlier candidate poses and outputs a single 6-DoF camera pose per query. Finally, a robust pose refinement module optimizes the 6-DoF query poses, taking as input the candidate global 6-DoF camera poses and their corresponding global 2D-3D matches, sparse 2D-2D feature matches between consecutive query images, and SLAM poses of the query sequence. Experiments on the 4Seasons dataset show that our method achieves high accuracy and robustness. In particular, our approach won the localization challenge of the ECCV 2022 Workshop on Map-based Localization for Autonomous Driving (MLAD-ECCV2022).
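The abstract does not spell out how the consensus set maximization step is formulated, so the following is only a minimal sketch of one plausible reading: each map contributes one candidate pose, two candidates are treated as consistent when their translations and rotations agree within thresholds, and the candidate with the largest set of consistent peers is kept. The function names, the pose representation as `(R, t)` pairs, and the threshold values `max_trans_m` and `max_rot_deg` are assumptions made for illustration and do not come from the report.

```python
import numpy as np


def rotation_angle_deg(R_a, R_b):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def largest_consensus_pose(poses, max_trans_m=0.5, max_rot_deg=5.0):
    """Pick the candidate pose supported by the most other candidates.

    poses: list of (R, t) tuples, one candidate per 3D map, where R is a
           3x3 rotation matrix and t is a length-3 translation vector.
    max_trans_m, max_rot_deg: hypothetical agreement thresholds.

    Returns (index of the selected candidate, list of inlier indices).
    Returns (-1, []) when no candidates are given.
    """
    best_idx, best_inliers = -1, []
    for i, (R_i, t_i) in enumerate(poses):
        # A candidate always agrees with itself, so it is in its own set.
        inliers = [
            j
            for j, (R_j, t_j) in enumerate(poses)
            if np.linalg.norm(np.asarray(t_i) - np.asarray(t_j)) <= max_trans_m
            and rotation_angle_deg(R_i, R_j) <= max_rot_deg
        ]
        # Strict comparison keeps the earliest candidate on ties.
        if len(inliers) > len(best_inliers):
            best_idx, best_inliers = i, inliers
    return best_idx, best_inliers
```

Calling `largest_consensus_pose` on the per-map candidates for one query returns the selected candidate together with the indices of the poses that agree with it. Candidates outside that set would be the outliers discarded before refinement. The exhaustive pairwise check is quadratic in the number of maps, which is acceptable when there are only a handful of reference sequences.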