Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator. Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples. While uncertainty-based strategies are susceptible to outliers, relying solely on sample diversity does not capture the information available on the main task. In this work, we develop a semi-supervised, minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner. Our model consists of an entropy-minimizing feature encoding network followed by an entropy-maximizing classification layer. This minimax formulation reduces the distribution gap between the labeled and unlabeled data, while a discriminator is simultaneously trained to distinguish labeled from unlabeled data. Among the samples that the discriminator predicts as unlabeled, those with the highest classifier entropy are selected for labeling. We evaluate our method on various image classification and semantic segmentation benchmark datasets and show superior performance over state-of-the-art methods.
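The selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_queries`, the `budget` parameter, the 0.5 decision threshold, and the convention that the discriminator outputs a probability of "unlabeled" are all assumptions made for illustration. The sketch only covers the final query step, taking classifier and discriminator outputs as given.

```python
import numpy as np


def select_queries(class_probs, disc_unlabeled_prob, budget, threshold=0.5):
    """Return up to `budget` pool indices, ordered by decreasing entropy.

    class_probs:         (n, k) classifier softmax outputs for the pool.
    disc_unlabeled_prob: (n,) discriminator scores, assumed here to be the
                         probability that a sample is unlabeled.
    threshold:           hypothetical cutoff for "predicted as unlabeled".
    """
    probs = np.asarray(class_probs, dtype=float)
    disc = np.asarray(disc_unlabeled_prob, dtype=float)

    # Shannon entropy of each predictive distribution (eps avoids log(0)).
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)

    # Keep only samples the discriminator predicts as unlabeled.
    candidates = np.flatnonzero(disc > threshold)

    # Rank the remaining candidates from highest to lowest entropy.
    order = np.argsort(-entropy[candidates], kind="stable")
    return candidates[order[:budget]]


if __name__ == "__main__":
    class_probs = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]
    disc_scores = [0.2, 0.9, 0.8, 0.7]
    print(select_queries(class_probs, disc_scores, budget=2).tolist())
```

In this toy pool, sample 0 has maximal entropy but is skipped because the discriminator scores it as labeled-like, which is the kind of interaction between the uncertainty and diversity signals that the abstract describes.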