Existing spatial localization techniques for autonomous vehicles mostly rely on a pre-built 3D-HD map, often constructed with a survey-grade 3D mapping vehicle, which is both expensive and laborious. This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we can achieve cross-view vehicle localization with satisfactory accuracy, providing a cheaper and more practical means of localization. While using satellite imagery for cross-view localization is an established concept, conventional methods focus primarily on image retrieval; this paper introduces a novel approach that departs from the image retrieval paradigm. Specifically, our method develops (1) a Geometric-align Feature Extractor (GaFE) that leverages measured 3D points to bridge the geometric gap between ground and overhead views, (2) a Pose Aware Branch (PAB) that adopts a triplet loss to encourage pose-aware feature extraction, and (3) a Recursive Pose Refine Branch (RPRB) that uses the Levenberg-Marquardt (LM) algorithm to iteratively align the initial pose with the true vehicle pose. Our method is validated with the KITTI and Ford Multi-AV Seasonal datasets as the ground view and Google Maps as the satellite view. The results demonstrate the superiority of our method in cross-view localization, with median spatial and angular errors within $1$ meter and $1^\circ$, respectively.
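To make the recursive refinement step more concrete, below is a minimal sketch of a Levenberg-Marquardt update over a 3-DoF vehicle pose (x, y, yaw). This is not the paper's implementation: the `residuals` function, the finite-difference Jacobian, and the synthetic target pose are illustrative assumptions standing in for the feature-metric errors that the actual RPRB would compute from the aligned ground and satellite features.

```python
# Minimal Levenberg-Marquardt sketch for refining a 3-DoF pose (x, y, yaw).
# Assumption: the residuals here compare the pose estimate to a synthetic
# target; a real system would derive residuals from cross-view feature errors.
import numpy as np

def residuals(pose, target):
    # Hypothetical residual vector; stands in for feature-space misalignment.
    return pose - target

def numerical_jacobian(f, pose, eps=1e-6):
    # Finite-difference Jacobian of the residual vector w.r.t. the pose.
    r0 = f(pose)
    J = np.zeros((r0.size, pose.size))
    for i in range(pose.size):
        p = pose.copy()
        p[i] += eps
        J[:, i] = (f(p) - r0) / eps
    return J

def lm_refine(pose_init, target, iters=20, lam=1e-2):
    pose = pose_init.astype(float).copy()
    f = lambda p: residuals(p, target)
    for _ in range(iters):
        r = f(pose)
        J = numerical_jacobian(f, pose)
        # Damped normal equations: (J^T J + lam * I) dp = -J^T r
        H = J.T @ J + lam * np.eye(pose.size)
        dp = np.linalg.solve(H, -J.T @ r)
        new_pose = pose + dp
        if np.sum(f(new_pose) ** 2) < np.sum(r ** 2):
            pose, lam = new_pose, lam * 0.5   # accept step, relax damping
        else:
            lam *= 2.0                        # reject step, increase damping
        if np.linalg.norm(dp) < 1e-8:
            break
    return pose

if __name__ == "__main__":
    init = np.array([2.0, -1.5, 0.3])    # noisy initial pose (x, y, yaw)
    true = np.array([0.0, 0.0, 0.0])     # ground-truth pose
    print(lm_refine(init, true))         # converges towards the true pose
```

The damping factor `lam` is what distinguishes LM from plain Gauss-Newton: it is decreased when a step reduces the residual and increased otherwise, trading off between gradient-descent-like caution and Gauss-Newton-like speed as the estimate approaches the true pose.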