Autonomous driving systems require a good understanding of the surrounding environment, including moving obstacles and static High-Definition (HD) semantic map elements. Existing methods approach the semantic map problem via offline manual annotation, which suffers from serious scalability issues. Recent learning-based methods produce dense rasterized segmentation predictions to construct maps. However, these predictions do not include instance information of individual map elements and require heuristic post-processing that involves many hand-designed components to obtain vectorized maps. To address these issues, we introduce an end-to-end vectorized HD map learning pipeline, termed VectorMapNet. VectorMapNet takes onboard sensor observations and predicts a sparse set of polyline primitives in the bird's-eye view to model the geometry of HD maps. This pipeline can explicitly model the spatial relations between map elements and generate vectorized maps that are friendly to downstream autonomous driving tasks without the need for post-processing. In our experiments, VectorMapNet achieves strong HD map learning performance on the nuScenes dataset, surpassing previous state-of-the-art methods by 14.2 mAP. Qualitatively, we also show that VectorMapNet is capable of generating comprehensive maps and capturing fine-grained details of road geometry. To the best of our knowledge, VectorMapNet is the first work designed for the end-to-end vectorized HD map learning problem. Our project website is available at https://tsinghua-mars-lab.github.io/vectormapnet/.
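To make the output representation concrete, below is a minimal sketch, not the authors' code, of the vectorized map format described above: each HD map element is a polyline, i.e. an ordered sequence of 2D vertices in the bird's-eye-view (BEV) frame, together with a semantic class label. The class names and field names (e.g. `PolylineElement`, "divider", "ped_crossing") are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of a vectorized HD map as a sparse set of polyline primitives.
# Names and structure are assumptions for exposition only.

from dataclasses import dataclass
from typing import List
import numpy as np


@dataclass
class PolylineElement:
    """One vectorized map element: a semantic class and its BEV polyline."""
    label: str            # e.g. "divider", "ped_crossing", "boundary" (assumed names)
    vertices: np.ndarray  # shape (N, 2): ordered (x, y) points in meters, BEV frame


def make_vectorized_map(elements: List[PolylineElement]) -> List[PolylineElement]:
    """Validate and collect a sparse set of polyline primitives into a map."""
    for e in elements:
        assert e.vertices.ndim == 2 and e.vertices.shape[1] == 2, "expect (N, 2) BEV points"
        assert len(e.vertices) >= 2, "a polyline needs at least two vertices"
    return elements


if __name__ == "__main__":
    # Toy example: one lane divider and one pedestrian crossing.
    divider = PolylineElement("divider", np.array([[0.0, 0.0], [5.0, 0.1], [10.0, 0.3]]))
    crossing = PolylineElement("ped_crossing", np.array([[2.0, -1.5], [2.0, 1.5]]))
    vec_map = make_vectorized_map([divider, crossing])
    print(f"{len(vec_map)} map elements, first has {len(vec_map[0].vertices)} vertices")
```

Because each element is already an explicit polyline with an instance identity, such an output can be consumed directly by downstream planning modules, which is the property the abstract contrasts with rasterized segmentation maps that need heuristic vectorization.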