Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine-learning-based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first published dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar- and image-based detection and tracking. Data, development kit and more information are available online at www.nuscenes.org.