Aiming to facilitate a real-world, ever-evolving and scalable autonomous driving system, we present a large-scale dataset for standardizing the evaluation of different self-supervised and semi-supervised approaches that learn from raw data, which is the first and largest such dataset to date. Existing autonomous driving systems rely heavily on `perfect' visual perception models (e.g., detection) trained on extensive annotated data to ensure safety. However, it is unrealistic to elaborately label instances of all scenarios and circumstances (e.g., night, extreme weather, cities) when deploying a robust autonomous driving system. Motivated by recent advances in self-supervised and semi-supervised learning, a promising direction is to learn a robust detection model by collaboratively exploiting large-scale unlabeled data and a small amount of labeled data. Existing datasets either provide only a small amount of data or cover limited domains with full annotation, hindering the exploration of large-scale pre-trained models. Here, we release a Large-Scale 2D Self/semi-supervised Object Detection dataset for Autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories. To improve diversity, the images are collected over 27,833 driving hours under different weather conditions, periods, and location scenes across 32 different cities. We provide extensive experiments and in-depth analyses of existing popular self/semi-supervised approaches, and report several interesting findings within the scope of autonomous driving. Experiments show that SODA10M can serve as a promising pre-training dataset for different self-supervised learning methods, yielding superior performance when fine-tuning on different downstream tasks (e.g., detection, semantic/instance segmentation) in the autonomous driving domain. More information can be found at https://soda-2d.github.io.
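To make the intended pre-train/fine-tune workflow concrete, below is a minimal PyTorch sketch, not the paper's actual pipeline: stage 1 pre-trains a ResNet-50 backbone on unlabeled SODA10M images with one representative self-supervised objective (a SimSiam-style negative-cosine loss, chosen here purely for brevity), and stage 2 transplants the backbone into a Faster R-CNN for fine-tuning on the 20K labeled images. The directory layout (`soda10m/unlabeled`, flat JPEGs) and class count (6 categories + background) are illustrative assumptions, not the dataset's official API.

```python
# Hedged sketch: pre-train on unlabeled SODA10M, then fine-tune a detector.
# Paths, file layout, and the SSL objective are assumptions for illustration.
import glob
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import Dataset
from torchvision import transforms
from PIL import Image

class UnlabeledSODA(Dataset):
    """Yields two augmented views per raw image, as SimSiam-style SSL expects."""
    def __init__(self, root="soda10m/unlabeled"):   # assumed flat JPEG layout
        self.paths = glob.glob(f"{root}/*.jpg")
        self.aug = transforms.Compose([
            transforms.RandomResizedCrop(224),
            transforms.RandomHorizontalFlip(),
            transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        img = Image.open(self.paths[i]).convert("RGB")
        return self.aug(img), self.aug(img)          # two independent views

# Stage 1: self-supervised pre-training of the backbone.
backbone = torchvision.models.resnet50(weights=None)
backbone.fc = nn.Identity()                          # keep 2048-d features
projector = nn.Sequential(nn.Linear(2048, 2048), nn.ReLU(), nn.Linear(2048, 2048))
predictor = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 2048))
params = (list(backbone.parameters()) + list(projector.parameters())
          + list(predictor.parameters()))
opt = torch.optim.SGD(params, lr=0.05, momentum=0.9, weight_decay=1e-4)

def ssl_step(v1, v2):
    """One SimSiam-style update: stop-gradient target, negative cosine loss."""
    z1, z2 = projector(backbone(v1)), projector(backbone(v2))
    p1, p2 = predictor(z1), predictor(z2)
    loss = -(nn.functional.cosine_similarity(p1, z2.detach()).mean()
             + nn.functional.cosine_similarity(p2, z1.detach()).mean()) / 2
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Stage 2: fine-tune a detector on the 20K labeled images.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=7)                     # 6 categories + background
# Transplant the pre-trained weights into the detector's ResNet body;
# strict=False skips heads/stats absent from the plain ResNet state dict.
detector.backbone.body.load_state_dict(backbone.state_dict(), strict=False)
```

The same transplanted backbone can be reused for the other downstream tasks mentioned above (semantic/instance segmentation) by loading it into the corresponding task heads instead of Faster R-CNN.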