Lane detection is one of the fundamental modules in self-driving systems. In this paper, we employ a transformer-only method for lane detection, which lets it benefit from the rapid development of fully transformer-based vision models and, by fine-tuning weights fully pre-trained on large datasets, achieve state-of-the-art (SOTA) performance on both the CULane and TuSimple benchmarks. More importantly, we propose a novel and general framework called PriorLane, which enhances the segmentation performance of the fully transformer-based model by introducing low-cost local prior knowledge. Specifically, PriorLane uses an encoder-only transformer to fuse the features extracted by a pre-trained segmentation model with prior knowledge embeddings. A Knowledge Embedding Alignment (KEA) module is used to improve the fusion by aligning the knowledge embeddings. Extensive experiments on our Zjlab dataset show that PriorLane outperforms SOTA lane detection methods by 2.82% mIoU when prior knowledge is employed.
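For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed for segmentation outputs. It is not the paper's evaluation code: the function name `mean_iou`, the flattened label lists, and the convention of skipping classes absent from both prediction and ground truth are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, for flattened per-pixel class labels.

    Assumption: classes that appear in neither pred nor target are
    skipped rather than counted as IoU = 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 6-pixel example: 0 = background, 1 = lane.
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 0, 0, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

In this toy example each class has 2 overlapping pixels out of a union of 4, so both per-class IoUs and their mean are 0.5.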