Although lane detection methods have shown impressive performance in real-world scenarios, most methods require post-processing that is not robust enough. Therefore, end-to-end detectors such as DEtection TRansformer (DETR) have been introduced to lane detection. However, the one-to-one label assignment in DETR can degrade training efficiency due to label semantic conflicts. In addition, the positional query in DETR cannot provide an explicit positional prior, which makes it difficult to optimize. In this paper, we present the One-to-Several Transformer (O2SFormer). We first propose one-to-several label assignment, which combines one-to-one and one-to-many label assignments to improve training efficiency while keeping detection end-to-end. To overcome the difficulty of optimizing the one-to-one assignment, we further propose the layer-wise soft label, which adjusts the positive weight of positive lane anchors across different decoder layers. Finally, we design the dynamic anchor-based positional query, which explores the positional prior by incorporating lane anchors into the positional query. Experimental results show that O2SFormer significantly speeds up the convergence of DETR and outperforms Transformer-based and CNN-based detectors on the CULane dataset. Code will be available at https://github.com/zkyseu/O2SFormer.
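To make the difference between the assignment schemes concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how one-to-one and one-to-many assignment are combined, so the sketch assumes one plausible scheme: one-to-many on intermediate decoder layers and one-to-one on the last layer, which keeps the final predictions free of duplicates. The function names, the cost matrix, the value of `k`, and the brute-force matcher (a stand-in for Hungarian matching that is only practical for tiny inputs) are all hypothetical.

```python
from itertools import permutations


def one_to_one(cost):
    """Give each ground-truth lane exactly one query, minimizing total cost.

    cost[g][q] is the matching cost between ground-truth lane g and query q.
    Brute force over permutations; assumes at least as many queries as lanes.
    """
    num_gt, num_q = len(cost), len(cost[0])
    best = min(
        permutations(range(num_q), num_gt),
        key=lambda perm: sum(cost[g][q] for g, q in enumerate(perm)),
    )
    return {g: [q] for g, q in enumerate(best)}


def one_to_many(cost, k):
    """Give each ground-truth lane its k cheapest queries.

    A query claimed by several lanes stays with the lane of lowest cost.
    """
    owner = {}  # query index -> (cost, ground-truth index)
    for g, row in enumerate(cost):
        cheapest = sorted(range(len(row)), key=row.__getitem__)[:k]
        for q in cheapest:
            if q not in owner or row[q] < owner[q][0]:
                owner[q] = (row[q], g)
    assigned = {g: [] for g in range(len(cost))}
    for q, (_, g) in sorted(owner.items()):
        assigned[g].append(q)
    return assigned


def one_to_several(cost, layer, num_layers, k=2):
    """Assumed combination: one-to-one on the last decoder layer only."""
    if layer == num_layers - 1:
        return one_to_one(cost)
    return one_to_many(cost, k)


# Toy cost matrix: 2 ground-truth lanes (rows) x 4 queries (columns).
cost = [
    [0.1, 0.2, 0.9, 0.8],
    [0.7, 0.6, 0.3, 0.2],
]

for layer in (0, 5):
    print(layer, one_to_several(cost, layer=layer, num_layers=6))
```

With this toy matrix, an intermediate layer supervises two queries per lane, while the last layer supervises a single query per lane. The extra positives on the earlier layers are the intuition behind the claimed gain in training efficiency.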