Active learning automatically selects samples for annotation from a data pool to achieve maximum performance with minimum annotation cost. This is particularly critical for semantic segmentation, where annotations are costly. In this work, we show in the context of semantic segmentation that the data distribution is decisive for the performance of the various active learning objectives proposed in the literature. Particularly, redundancy in the data, as it appears in most driving scenarios and video datasets, plays a large role. We demonstrate that the integration of semi-supervised learning with active learning can improve performance when the two objectives are aligned. Our experimental study shows that current active learning benchmarks for segmentation in driving scenarios are not realistic since they operate on data that is already curated for maximum diversity. Accordingly, we propose a more realistic evaluation scheme in which the value of active learning becomes clearly visible, both by itself and in combination with semi-supervised learning.
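The selection mechanism the abstract describes can be sketched as a minimal pool-based active learning loop with uncertainty (entropy) sampling — one common selection objective from the literature. This is an illustrative sketch only, not the method of this paper; the function names and the toy model are hypothetical.

```python
# Minimal sketch of pool-based active learning with uncertainty
# (entropy) sampling. Illustrative only; names are hypothetical.
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(pool, predict, budget):
    """Pick the `budget` pool samples the model is least certain about."""
    scored = [(entropy(predict(x)), x) for x in pool]
    scored.sort(key=lambda t: t[0], reverse=True)  # most uncertain first
    return [x for _, x in scored[:budget]]

# Toy demo: a "model" that is confident on even inputs, uncertain on odd.
def toy_predict(x):
    return [0.9, 0.1] if x % 2 == 0 else [0.5, 0.5]

pool = list(range(10))
picked = select_for_annotation(pool, toy_predict, budget=3)
print(picked)  # the maximally uncertain (odd) samples are selected
```

Note that on highly redundant data (e.g. consecutive video frames), such a purely uncertainty-based objective tends to pick near-duplicates, which is exactly the kind of distribution effect the abstract highlights.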