Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven progress in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine-learning-based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360-degree field of view. nuScenes comprises 1000 scenes, each 20 s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We also define a new metric for 3D detection which consolidates the multiple aspects of the detection task: classification, localization, size, orientation, velocity and attribute estimation. We provide careful dataset analysis as well as baseline performance for lidar- and image-based detection methods. Data, development kit, and more information are available at www.nuscenes.org.