We introduce Flatlandia, a novel problem for visual localization of an image from object detections, composed of two specific tasks: i) Coarse Map Localization: localizing a single image that observes a set of objects with respect to a 2D map of object landmarks; ii) Fine-grained 3DoF Localization: estimating the latitude, longitude, and orientation of the image within a 2D map. Solutions for these new tasks exploit the wide availability of open urban maps annotated with GPS locations of common objects (e.g., via surveying or crowd-sourcing). Such maps are also more storage-friendly than the large-scale 3D models commonly used in visual localization, while additionally being privacy-preserving. As existing datasets are unsuited for the proposed problem, we provide the Flatlandia dataset, designed for 3DoF visual localization in multiple urban settings and based on crowd-sourced data from five European cities. We use the Flatlandia dataset to validate the complexity of the proposed tasks.