Most existing learning models, particularly deep neural networks, rely on large datasets whose hand-labeling is expensive and time-consuming. A current trend is to make the learning of these models frugal and less dependent on large collections of labeled data. Among the existing solutions, deep active learning is currently witnessing major interest; its purpose is to train deep networks using as few labeled samples as possible. However, the success of active learning depends strongly on how critical these samples are when training models. In this paper, we devise a novel active learning approach for label-efficient training. The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria. The approach is probabilistic and unifies all these criteria in a single objective function whose solution models the probability of relevance of samples (i.e., how critical they are) when learning a decision function. We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration using a particular stateless Q-learning model. Extensive experiments conducted on staple image classification data, including Object-DOTA, show the effectiveness of our proposed model w.r.t. several baselines, including random, uncertainty and flat selection, as well as related work.
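The criterion mixing and stateless Q-learning weighting described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the per-sample criterion scores, the epsilon-greedy weighting, the per-criterion Q update, and the placeholder reward are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample scores (higher = more valuable). In the paper
# these would come from the diversity, representativity and uncertainty
# terms of the constrained objective.
n = 200
scores = {
    "diversity": rng.random(n),
    "representativity": rng.random(n),
    "uncertainty": rng.random(n),
}

# Stateless Q-learning (bandit-style): one Q-value per criterion,
# updated from the reward observed after each acquisition round.
criteria = list(scores)
Q = {c: 0.0 for c in criteria}
alpha, eps = 0.5, 0.1  # learning rate and exploration rate (assumed)

def mixing_weights(Q, eps):
    """Epsilon-greedy mixing weights over criteria from current Q-values."""
    w = {c: eps / len(Q) for c in Q}
    w[max(Q, key=Q.get)] += 1.0 - eps
    return w

def relevance(scores, w):
    """Probability of relevance: normalized weighted mix of the criteria."""
    mix = sum(w[c] * scores[c] for c in scores)
    return mix / mix.sum()

# One illustrative acquisition round: select the 10 most relevant samples,
# then feed back a (simulated) validation gain as reward to update Q.
w = mixing_weights(Q, eps)
p = relevance(scores, w)
picked = np.argsort(p)[-10:]
reward = 0.05  # placeholder for the observed accuracy improvement
for c in criteria:
    Q[c] += alpha * w[c] * (reward - Q[c])
```

The stateless aspect is that Q-values are indexed by action (criterion) only, with no environment state, which reduces the scheme to a multi-armed bandit over criterion weightings.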