Deep Active Learning via Open Set Recognition

In many applications, data is easy to acquire but expensive and time-consuming to label; prominent examples include medical imaging and NLP. This disparity has only grown in recent years as our ability to collect data has improved. Under these constraints, it makes sense to select only the most informative instances from the unlabeled pool and ask an oracle (e.g., a human expert) to label those samples. The goal of active learning is to infer the informativeness of unlabeled samples so as to minimize the number of requests to the oracle. Here, we formulate active learning as an open-set recognition problem. In this paradigm, only some of the inputs belong to known classes; the classifier must identify the rest as unknown. More specifically, we leverage variational neural networks (VNNs), which produce high-confidence (i.e., low-entropy) predictions only for inputs that closely resemble the training data. We use the inverse of this confidence measure to select the samples that the oracle should label. Intuitively, unlabeled samples that the VNN is uncertain about are more informative for future training. We carried out an extensive evaluation of our novel, probabilistic formulation of active learning, achieving state-of-the-art results on MNIST, CIFAR-10, and CIFAR-100. Additionally, unlike current active learning methods, our algorithm can learn tasks without the need for task labels. As our experiments show, when the unlabeled pool consists of a mixture of samples from multiple datasets, our approach can automatically distinguish between samples from seen and unseen tasks.
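The selection rule described above (query the samples on which the model is least confident) can be illustrated with a minimal sketch. It assumes that per-sample class probabilities are already available and uses predictive entropy as the inverse-confidence score; the function names `predictive_entropy` and `select_for_labeling` and the toy probabilities are hypothetical, and this is not the authors' VNN implementation, whose exact scoring function may differ.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    # Zero-probability terms contribute nothing, so skip them to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` most uncertain (highest-entropy) samples."""
    scores = [predictive_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy unlabeled pool: each row is a hypothetical model's class probabilities.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident: resembles the training data
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

chosen = select_for_labeling(pool_probs, budget=2)
print(chosen)  # indices of the samples to send to the oracle
```

In an active learning loop, the chosen samples would be labeled by the oracle, moved into the training set, and the model retrained before the next round of selection.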


