Current perception models in autonomous driving have become notorious for relying heavily on massive amounts of annotated data to cover unseen cases and address the long-tail problem. Meanwhile, learning from large-scale unlabeled data and incrementally self-training powerful recognition models has received increasing attention and may become the solution for next-generation, industry-level robust perception models in autonomous driving. However, the research community has long suffered from a shortage of such essential real-world scene data, which hampers future exploration of fully/semi/self-supervised methods for 3D perception. In this paper, we introduce the ONCE (One millioN sCenEs) dataset for 3D object detection in the autonomous driving scenario. The ONCE dataset consists of 1 million LiDAR scenes and 7 million corresponding camera images. The data is selected from 144 driving hours, 20x longer than the largest existing 3D autonomous driving datasets (e.g., nuScenes and Waymo), and is collected across a range of areas, periods, and weather conditions. To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset. We conduct extensive analyses of these methods and provide valuable observations on how their performance relates to the scale of data used. Data, code, and more information are available at https://once-for-auto-driving.github.io/index.html.