Moving Object Segmentation (MOS), a crucial task in computer vision, has numerous applications such as surveillance, autonomous driving, and video analytics. Existing datasets for moving object segmentation mainly focus on RGB or LiDAR videos and lack the additional event information that can enhance the understanding of dynamic scenes. To address this limitation, we propose a novel dataset called DSEC-MOS. Our dataset includes frames captured by RGB cameras mounted on moving vehicles and incorporates event data, which provides high-temporal-resolution, low-latency information about changes in the scene. To generate accurate segmentation mask annotations for moving objects, we apply the recently released Segment Anything Model (SAM) to calibrated RGB frames, using moving object bounding boxes from DSEC-MOD as prompts, and then further revise the results. Our DSEC-MOS dataset contains 16 sequences (13,314 frames) in total. To the best of our knowledge, DSEC-MOS is also the first moving object segmentation dataset for autonomous driving that includes an event camera. Project Page: https://github.com/ZZY-Zhou/DSEC-MOS.
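The abstract describes prompting SAM with moving object bounding boxes to obtain masks on calibrated RGB frames. The snippet below is a minimal sketch of how such box-prompted mask generation could look with the official segment-anything package; the checkpoint file, box coordinates, and file names are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of box-prompted SAM annotation; paths, box values,
# and file names are placeholders, not the authors' actual pipeline.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM backbone; the "vit_h" variant and checkpoint name are assumptions.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Calibrated RGB frame (aligned to the event camera view).
frame = cv2.cvtColor(cv2.imread("frame_000000.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(frame)

# One moving-object bounding box (x_min, y_min, x_max, y_max) used as a prompt,
# e.g. taken from the DSEC-MOD detection annotations.
box = np.array([120, 200, 310, 380])
masks, scores, _ = predictor.predict(box=box, multimask_output=False)

# masks[0] is a boolean HxW array; such masks would still be revised afterwards.
binary_mask = masks[0].astype(np.uint8) * 255
cv2.imwrite("mask_000000.png", binary_mask)
```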