This paper proposes a novel framework for real-time localization and egomotion tracking of a vehicle in a reference map. The core idea is to map the semantic objects observed by the vehicle and register them to their corresponding objects in the reference map. While several recent works have leveraged semantic information for cross-view localization, the main contribution of this work is a view-invariant formulation that makes the approach directly applicable to any viewpoint configuration for which objects are detectable. Another distinctive feature is robustness to changes in the environment and its objects, achieved through a data association scheme suited for extreme outlier regimes (e.g., 90% association outliers). To demonstrate our framework, we consider an example of localizing a ground vehicle in a reference object map using only cars as objects. The ground vehicle uses only a stereo camera. The reference maps are constructed a priori from ground viewpoints using stereo cameras and Lidar scans, and from georeferenced aerial images captured at a different date, to demonstrate the framework's robustness to different modalities, viewpoints, and environment changes. Evaluations on the KITTI dataset over a 3.7 km trajectory show that, in a Lidar reference map, localization occurs in 36 s and is followed by real-time egomotion tracking with an average position error of 8.5 m. On an aerial object map where 77% of objects are outliers, localization is achieved in 71 s with an average position error of 7.9 m.
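To make the idea of registering observed objects to a reference map under heavy outlier contamination concrete, here is a minimal sketch. It is not the data association scheme used in the paper, which the abstract does not specify. It swaps in a generic RANSAC loop around a least-squares rigid fit (the Kabsch method). It assumes 2D object positions and a given list of putative pairings in which row `i` of `src` is tentatively matched to row `i` of `dst`. The function names, the tolerance `tol`, and the iteration count `n_iters` are all hypothetical choices for illustration.

```python
import numpy as np


def fit_rigid_2d(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_register(src, dst, n_iters=2000, tol=0.5, seed=0):
    """Estimate a rigid transform from putative pairs (src[i], dst[i]).

    Assumption for this sketch: most pairs may be wrong, but the correct
    ones agree on a single rigid transform up to `tol` (same units as the
    points, e.g. meters).
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Two pairs are the minimal sample for a 2D rigid transform.
        i, j = rng.choice(len(src), size=2, replace=False)
        if np.linalg.norm(src[i] - src[j]) < 1e-9:
            continue  # degenerate sample: coincident points
        R, t = fit_rigid_2d(src[[i, j]], dst[[i, j]])
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = residuals < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 2:
        raise ValueError("no consistent set of associations found")
    # Refit on all pairs that agree with the best hypothesis.
    R, t = fit_rigid_2d(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```

With 90% of the pairs wrong, a random two-pair sample is outlier-free only about 1% of the time, so the loop needs on the order of thousands of iterations. This cost grows quickly as the outlier ratio rises, which is one reason to use association schemes designed specifically for extreme outlier regimes.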