Supervised object detection has proven successful on many benchmark datasets, achieving human-level performance. However, acquiring a large number of labeled images for supervised detector training is tedious, time-consuming, and costly. In this paper, we propose an efficient image selection approach that samples the most informative images from the unlabeled dataset and uses human-machine collaboration in an iterative train-annotate loop. Image features are extracted by a CNN, and a similarity score is then computed as the Euclidean distance between feature vectors. Unlabeled images are then sampled using different strategies based on this similarity score. The proposed approach is simple, and sampling takes place before network training. Experiments show that our method can reduce the manual annotation workload by up to 80% compared to a fully manual labeling setting, and that it performs better than random sampling.
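To make the selection step concrete, the following is a minimal sketch of distance-based sampling over precomputed features. It is an illustration under stated assumptions, not the paper's implementation: the function names `pairwise_euclidean` and `select_most_dissimilar` are hypothetical, the toy 2-D vectors stand in for CNN features, and the rule of picking the images farthest from the already-labeled set is only one possible sampling strategy, which the abstract does not specify.

```python
import numpy as np


def pairwise_euclidean(a, b):
    """Return the (n, m) matrix of Euclidean distances between rows of a and b."""
    # Broadcasting: (n, 1, d) - (1, m, d) -> (n, m, d)
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def select_most_dissimilar(unlabeled_feats, labeled_feats, k):
    """Pick the k unlabeled images farthest from their nearest labeled image.

    ASSUMPTION: "most informative" is approximated here as "least similar to
    anything already labeled". The source does not state the actual rule.
    """
    dists = pairwise_euclidean(unlabeled_feats, labeled_feats)
    nearest = dists.min(axis=1)  # distance to the closest labeled image
    order = np.argsort(-nearest, kind="stable")  # largest distance first
    return order[:k]


# Toy features standing in for CNN embeddings (hypothetical values).
labeled = np.array([[0.0, 0.0]])
unlabeled = np.array([[0.1, 0.0], [5.0, 5.0], [1.0, 0.0], [3.0, 4.0]])

picked = select_most_dissimilar(unlabeled, labeled, k=2)
print(picked.tolist())  # prints [1, 3]
```

In an iterative train-annotate loop, the selected indices would be sent for annotation, their features moved into the labeled set, and the selection repeated. Because the step only needs feature vectors, it can run before any detector training.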