Monocular 3D lane detection is a challenging task due to the lack of depth information. A popular solution to 3D lane detection is to first transform front-view (FV) images or features into the bird's-eye-view (BEV) space with inverse perspective mapping (IPM) and detect lanes from the BEV features. However, IPM's reliance on the flat-ground assumption, together with its loss of context information, makes it inaccurate to restore 3D information from BEV representations. An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, but it still underperforms other BEV-based methods given its lack of structured representation for 3D lanes. In this paper, we define 3D lane anchors in the 3D space and propose a BEV-free method named Anchor3DLane to predict 3D lanes directly from FV representations. The 3D lane anchors are projected onto the FV features to extract anchor features, which contain both good structural and context information for making accurate predictions. We further extend Anchor3DLane to the multi-frame setting to incorporate temporal information for performance improvement. In addition, we develop a global optimization method that exploits the equal-width property between lanes to reduce the lateral error of predictions. Extensive experiments on three popular 3D lane detection benchmarks show that our Anchor3DLane outperforms previous BEV-based methods and achieves state-of-the-art performance.
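The core idea of projecting 3D lane anchors onto FV features can be illustrated with a standard pinhole camera projection. The sketch below is a minimal, hypothetical illustration (not the paper's implementation); the function name, matrix shapes, and coordinate conventions are assumptions for illustration only.

```python
import numpy as np

def project_anchors(anchors_3d, intrinsics, extrinsics):
    """Project 3D anchor points onto the image plane (illustrative sketch).

    anchors_3d: (N, 3) points in the ground/world frame.
    intrinsics: (3, 3) pinhole camera matrix K.
    extrinsics: (4, 4) transform from the ground/world frame to the camera frame.
    Returns (N, 2) pixel coordinates, which could then be used to sample FV features.
    """
    n = anchors_3d.shape[0]
    # Homogeneous coordinates: (N, 4)
    homo = np.concatenate([anchors_3d, np.ones((n, 1))], axis=1)
    # Transform into the camera frame and keep (x, y, z)
    cam = (extrinsics @ homo.T).T[:, :3]
    # Pinhole projection followed by perspective division by depth
    uvw = (intrinsics @ cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]
```

In an anchor-based detector of this kind, the projected pixel coordinates would typically be used to bilinearly sample the FV feature map at each anchor point, yielding per-anchor features for classification and 3D offset regression.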