Recent years have witnessed great success in 3D object detection for recognizing common objects in autonomous driving (e.g., vehicles and pedestrians). However, most methods rely heavily on large amounts of well-labeled training data. This limits their ability to detect rare fine-grained objects (e.g., police cars and ambulances), which is important in special cases such as emergency rescue. To achieve simultaneous detection of both common and rare objects, we propose a novel task, called generalized few-shot 3D object detection, in which a large amount of training data is available for common (base) classes but only a few samples exist for rare (novel) classes. Specifically, we analyze the key differences between images and point clouds, and then present a practical principle for constructing the few-shot setting on a 3D LiDAR dataset. To solve this task, we propose a simple and effective detection framework, including (1) an incremental fine-tuning method that extends existing 3D detection models to recognize both common and rare objects, and (2) a sample-adaptive balance loss that alleviates the long-tailed data distribution in autonomous driving scenarios. Extensive experiments on the nuScenes dataset demonstrate that our approach can successfully detect the rare (novel) classes with only a few training samples, while also maintaining the detection accuracy of common objects.
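As a rough illustration of what a sample-adaptive balance loss for long-tailed class distributions might look like, the sketch below weights a per-sample cross-entropy term inversely to class frequency. This is a hypothetical stand-in (the abstract does not give the paper's exact formulation); the `beta`-based "effective number of samples" weighting and the function names `class_weights` / `balanced_ce` are assumptions for illustration only.

```python
import math

def class_weights(counts, beta=0.999):
    # Illustrative frequency-based reweighting: classes with fewer
    # training samples receive larger weights, so rare (novel) classes
    # contribute more to the loss than common (base) classes.
    return [(1.0 - beta) / (1.0 - beta ** n) for n in counts]

def balanced_ce(prob_true, class_idx, weights):
    # Weighted cross-entropy for a single sample: prob_true is the
    # predicted probability of the ground-truth class.
    return -weights[class_idx] * math.log(prob_true)
```

For example, with `counts = [10000, 10]` (a common class vs. a rare class), the rare class receives a weight roughly 100x larger, so a misclassified rare-class sample dominates the loss accordingly.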