Obtaining a large-scale labeled object detection dataset can be costly and time-consuming, as it involves annotating images with bounding boxes and class labels. Several specialized active learning methods have therefore been proposed to reduce this cost by selecting either coarse-grained samples or fine-grained instances from unlabeled data for labeling. However, the former suffer from redundant labeling, while the latter generally lead to training instability and sampling bias. To address these challenges, we propose a novel approach called Multi-scale Region-based Active Learning (MuRAL) for object detection. MuRAL identifies informative regions of various scales, which reduces annotation costs for well-learned objects and improves training performance. The informative region score is designed to consider both the predicted confidence of instances and the distribution of each object category, enabling our method to focus more on difficult-to-detect classes. Moreover, MuRAL employs a scale-aware selection strategy that ensures diverse regions are selected from different scales for labeling and downstream fine-tuning, which enhances training stability. Our proposed method surpasses all existing coarse-grained and fine-grained baselines on the Cityscapes and MS COCO datasets, and demonstrates significant improvement on difficult categories.
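To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the scoring or selection procedure actually used by MuRAL. It assumes a region score defined as the sum of class-weighted uncertainty (1 - confidence) over the instances predicted in a region, with inverse-frequency class weights standing in for the category distribution, and a round-robin pick across scale buckets standing in for scale-aware selection. All names (`Detection`, `Region`, `class_weights`, `region_score`, `select_regions`) and the scale labels are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str         # predicted class
    confidence: float  # predicted score in [0, 1]


@dataclass(frozen=True)
class Region:
    region_id: str
    scale: str         # hypothetical scale bucket, e.g. "small" or "large"
    detections: tuple  # Detection objects predicted inside the region


def class_weights(regions):
    """Assumed weighting: inverse frequency, so rarer predicted classes weigh more."""
    counts = Counter(d.label for r in regions for d in r.detections)
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


def region_score(region, weights):
    """Assumed score: class-weighted uncertainty summed over the region's detections."""
    return sum(weights[d.label] * (1.0 - d.confidence) for d in region.detections)


def select_regions(regions, budget):
    """Pick up to `budget` regions, cycling over scales so that no scale dominates."""
    weights = class_weights(regions)
    by_scale = defaultdict(list)
    for r in regions:
        by_scale[r.scale].append(r)
    # Within each scale, rank regions from most to least informative.
    for bucket in by_scale.values():
        bucket.sort(key=lambda r: region_score(r, weights), reverse=True)

    selected = []
    scales = sorted(by_scale)
    while len(selected) < budget and any(by_scale[s] for s in scales):
        for s in scales:
            if by_scale[s] and len(selected) < budget:
                selected.append(by_scale[s].pop(0))
    return selected


# Toy data: "rider" is rare and predicted with low confidence.
regions = [
    Region("a", "small", (Detection("car", 0.95),)),
    Region("b", "small", (Detection("rider", 0.40),)),
    Region("c", "large", (Detection("car", 0.90), Detection("car", 0.85))),
    Region("d", "large", (Detection("car", 0.55),)),
]
picked = [r.region_id for r in select_regions(regions, budget=2)]
print(picked)
```

With this toy data the budget of two is split across both scale buckets, and within the "small" bucket the region containing the rare, low-confidence class is preferred over the confidently detected car. A single global top-k ranking would not guarantee coverage of every scale, which is the motivation for the round-robin step in this sketch.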