Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning-based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar- and image-based detection and tracking. Data, development kit and more information are available online.
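Since the abstract points to the publicly released development kit, a minimal, hedged sketch of how the dataset might be browsed with the nuscenes-devkit is given below; the version string and data root path are placeholders, not values taken from the paper.

```python
# Minimal sketch using the nuscenes-devkit (pip install nuscenes-devkit).
# 'v1.0-mini' and '/data/sets/nuscenes' are placeholder values; substitute
# the split and local path you actually downloaded.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

# Each scene is a roughly 20 s log; each sample (keyframe) bundles data from
# the 6 cameras, 5 radars and the lidar, plus its 3D box annotation tokens.
scene = nusc.scene[0]
sample = nusc.get('sample', scene['first_sample_token'])
print(scene['name'],
      len(sample['data']), 'sensor channels,',
      len(sample['anns']), 'annotations in the first keyframe')
```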