Lane detection plays a key role in autonomous driving. Although in-vehicle cameras continuously capture streaming video while driving, current lane detection work focuses mainly on individual images (frames), ignoring the dynamics across a video. In this work, we collect a new video instance lane detection dataset, VIL-100, which contains 100 videos with 10,000 frames in total, acquired from diverse real traffic scenarios. All frames in each video are manually annotated with high-quality instance-level lane labels, and a set of frame-level and video-level metrics is included for quantitative performance evaluation. Moreover, we propose a new baseline model, the multi-level memory aggregation network (MMA-Net), for video instance lane detection. In our approach, the representation of the current frame is enhanced by attentively aggregating both local and global memory features from other frames. Experiments on the newly collected dataset show that the proposed MMA-Net outperforms state-of-the-art lane detection methods and video object segmentation methods. We release our dataset and code at https://github.com/yujun0-0/MMA-Net.
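To make the memory-aggregation idea concrete, the sketch below shows a generic softmax-attention readout over per-frame memory features. It is only an illustration of the general technique, not the authors' MMA-Net implementation; the function name, shapes, and the residual-style combination are all assumptions for exposition.

```python
import numpy as np

def attentive_memory_aggregation(query, mem_keys, mem_values):
    """Enhance a current-frame feature by attending over memory frames.

    Illustrative sketch only (NOT the MMA-Net architecture):
      query      : (C,)   feature of the current frame
      mem_keys   : (T, C) keys from T other (memory) frames
      mem_values : (T, C) values from the same memory frames
    """
    # Scaled dot-product attention scores over the T memory frames.
    scores = mem_keys @ query / np.sqrt(query.shape[0])   # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                              # softmax over frames
    aggregated = weights @ mem_values                     # (C,) memory readout
    # Residual combination: keep the current-frame feature and add the readout.
    return query + aggregated

# Toy usage: 4 memory frames with 8-dimensional features.
rng = np.random.default_rng(0)
q = rng.standard_normal(8)
out = attentive_memory_aggregation(q,
                                   rng.standard_normal((4, 8)),
                                   rng.standard_normal((4, 8)))
```

In MMA-Net this kind of readout is applied at multiple feature levels, drawing on both a local memory (nearby frames) and a global memory (frames farther away in the video).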