Unlabeled examples awaiting annotation inevitably contain open-set noise. A few active learning studies have attempted to handle this noise during sample selection by filtering out the noisy examples. However, because focusing on the purity of the examples in a query set leads to overlooking their informativeness, how best to balance purity and informativeness remains an important question. In this paper, to solve this purity-informativeness dilemma in open-set active learning, we propose a novel Meta-Query-Net (MQ-Net) that adaptively finds the best balance between the two factors. Specifically, by leveraging the multi-round property of active learning, we train MQ-Net using a query set without an additional validation set. Furthermore, MQ-Net effectively captures a clear dominance relationship between unlabeled examples through a novel skyline regularization. Extensive experiments on multiple open-set active learning scenarios demonstrate that the proposed MQ-Net achieves a 20.14% improvement in accuracy compared with state-of-the-art methods.
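The dominance relationship mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the actual formulation of the skyline regularization, so this is not MQ-Net's method. The sketch assumes that each unlabeled example has a (purity, informativeness) score pair in which higher is better on both axes, and that "dominance" is meant in the usual skyline (Pareto) sense. The names `dominates`, `skyline`, and `count_dominance_violations` are hypothetical helpers introduced only for this illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: each example is summarized as (purity, informativeness),
# and a higher value is better on both axes.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is at least as good as b on both axes
    and strictly better on at least one (standard Pareto dominance)."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def skyline(points: Sequence[Score]) -> List[Score]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def count_dominance_violations(points: Sequence[Score],
                               meta_scores: Sequence[float]) -> int:
    """Count ordered pairs (i, j) where example i dominates example j
    but receives a strictly lower meta-score. A dominance-respecting
    scoring function would have zero such pairs."""
    violations = 0
    for i, a in enumerate(points):
        for j, b in enumerate(points):
            if dominates(a, b) and meta_scores[i] < meta_scores[j]:
                violations += 1
    return violations


if __name__ == "__main__":
    # Hypothetical scores, made up purely for illustration.
    examples = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
    print(skyline(examples))
    print(count_dominance_violations(examples, [1.0, 0.5, 0.7, 1.0]))
```

Under these assumptions, a scoring function that respects dominance never ranks a dominated example above the example that dominates it. A regularizer could penalize the violating pairs counted here, or the constraint could be built into the model; the abstract does not say which approach MQ-Net takes.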