Detecting 3D lanes from a camera is an emerging problem for autonomous vehicles. In this task, an accurate camera pose is the key to generating accurate lanes, since it allows transforming the image from the perspective view to the top view. This transformation removes perspective effects, so that 3D lanes look similar across distances and can be fitted accurately by low-order polynomials. However, mainstream 3D lane detectors rely on perfect camera poses provided by other sensors, which is expensive and suffers from multi-sensor calibration issues. To overcome this problem, we propose a two-stage framework that predicts 3D lanes by estimating the camera pose from a single image. The first stage addresses camera pose estimation from perspective-view images. To improve pose estimation, we introduce an auxiliary 3D lane task and geometry constraints to benefit from multi-task learning, which enforces consistency between 3D and 2D, as well as compatibility between the two tasks. The second stage targets the 3D lane task. It uses the previously estimated pose to generate top-view images containing distance-invariant lane appearances, from which accurate 3D lanes are predicted. Experiments demonstrate that, without ground-truth camera poses, our method outperforms state-of-the-art methods that depend on perfect camera poses, while having the fewest parameters and computations. Code is available at https://github.com/liuruijin17/CLGo.
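To make the perspective-to-top-view transformation concrete, below is a minimal sketch of the standard inverse-perspective-mapping (IPM) warp that such a pose enables. The intrinsics, pitch, camera height, metric ground ranges, and the file name `frame.png` are illustrative assumptions, not values or code from the paper.

```python
# Minimal IPM sketch: warp a perspective image onto the ground plane
# using an estimated camera pitch and height. All numeric values are
# assumptions for illustration, not the paper's configuration.
import numpy as np
import cv2

def ipm_homography(K, pitch, height, bev_size, x_range, y_range):
    """Homography mapping top-view (BEV) pixels to image pixels.

    K        : 3x3 camera intrinsics.
    pitch    : camera pitch-down angle in radians.
    height   : camera height above the ground plane in meters.
    bev_size : (bev_w, bev_h) of the output top-view image.
    x_range  : (x_min, x_max) lateral ground range in meters.
    y_range  : (y_min, y_max) forward ground range in meters.
    """
    c, s = np.cos(pitch), np.sin(pitch)
    # World frame: x right, y forward, z up, origin on the ground
    # directly below the camera. R0 aligns world axes with camera
    # axes (x right, y down, z forward); Rx applies the pitch.
    R0 = np.array([[1, 0, 0],
                   [0, 0, -1],
                   [0, 1, 0]], dtype=np.float64)
    Rx = np.array([[1, 0, 0],
                   [0, c, -s],
                   [0, s, c]], dtype=np.float64)
    A = Rx @ R0
    # A ground point (x, y, 0) maps to A @ [x, y, -height]^T in the
    # camera frame, so the plane-to-image homography stacks the first
    # two columns of A with -height times its third column.
    M = np.stack([A[:, 0], A[:, 1], -height * A[:, 2]], axis=1)
    # Affine map from BEV pixel (u, v) to ground meters (x, y); the
    # top row of the BEV image corresponds to the far end of y_range.
    bev_w, bev_h = bev_size
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    S = np.array([[(x_max - x_min) / bev_w, 0, x_min],
                  [0, -(y_max - y_min) / bev_h, y_max],
                  [0, 0, 1]], dtype=np.float64)
    return K @ M @ S  # BEV pixel -> image pixel

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
img = cv2.imread("frame.png")  # hypothetical input frame
H = ipm_homography(K, pitch=np.deg2rad(3.0), height=1.5,
                   bev_size=(256, 512),
                   x_range=(-10, 10), y_range=(3, 103))
# H maps destination (BEV) pixels to source pixels, so warp with
# WARP_INVERSE_MAP to sample the perspective image directly.
bev = cv2.warpPerspective(img, H, (256, 512),
                          flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```

In such a top view, lanes become near-parallel curves with distance-invariant width, which is why a low-order polynomial (e.g., `np.polyfit` of degree 2 or 3 on the lane points) fits them accurately; an erroneous pose distorts this view and degrades the fit, motivating the pose-estimation stage.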