Autonomous driving systems require a thorough understanding of the surrounding environment, including moving obstacles and static High-Definition (HD) semantic map elements. Existing methods approach the semantic map problem with offline manual annotation, which suffers from serious scalability issues. Recent learning-based methods produce dense rasterized segmentation predictions to construct maps. However, these predictions lack instance information about individual map elements and require heuristic post-processing to obtain vectorized maps. To tackle these challenges, we introduce an end-to-end vectorized HD map learning pipeline, termed VectorMapNet. VectorMapNet takes onboard sensor observations and predicts a sparse set of polylines in the bird's-eye view. This pipeline explicitly models the spatial relations between map elements and generates vectorized maps that are friendly to downstream autonomous driving tasks. Extensive experiments show that VectorMapNet achieves strong map learning performance on both the nuScenes and Argoverse2 datasets, surpassing previous state-of-the-art methods by 14.2 mAP and 14.6 mAP, respectively. Qualitatively, we also show that VectorMapNet is capable of generating comprehensive maps and capturing fine-grained details of road geometry. To the best of our knowledge, VectorMapNet is the first work designed towards end-to-end vectorized map learning from onboard observations. Our project website is available at https://tsinghua-mars-lab.github.io/vectormapnet/.
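To make "a sparse set of polylines" concrete, the sketch below shows one plausible way such a vectorized map output could be structured: each map element is a typed, ordered sequence of 2-D points in the bird's-eye-view frame. All names, classes, and coordinates here are illustrative assumptions, not the paper's released API.

```python
# Minimal sketch of a vectorized HD map: each element is a typed polyline
# (class label + ordered (x, y) vertices in the BEV frame). Class names and
# coordinates are hypothetical, for illustration only.
from dataclasses import dataclass


@dataclass
class MapElement:
    label: str                          # e.g. "lane_divider", "ped_crossing"
    points: list                        # ordered (x, y) vertices, in metres


def to_vectorized_map(elements):
    """Group predicted polylines by class label into a sparse vectorized map."""
    vector_map = {}
    for elem in elements:
        vector_map.setdefault(elem.label, []).append(elem.points)
    return vector_map


# Two hypothetical predictions: a lane divider and a pedestrian crossing.
pred = [
    MapElement("lane_divider", [(0.0, 0.0), (5.0, 0.1), (10.0, 0.3)]),
    MapElement("ped_crossing", [(2.0, -3.0), (2.0, 3.0), (4.0, 3.0), (4.0, -3.0)]),
]
vmap = to_vectorized_map(pred)
```

Unlike a rasterized segmentation mask, this representation keeps per-element instance identity, so no heuristic post-processing is needed to recover individual map elements.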