One of the biggest challenges in applied supervised machine learning is the need for large amounts of labeled data. Active Learning (AL) is a well-established method for obtaining labeled data efficiently: a query strategy selects the most informative samples to be labeled first. Although many query strategies have been proposed, none has yet proven clearly superior across all domains. Additionally, many strategies are computationally expensive, which further hinders the widespread use of AL in large-scale annotation projects. We therefore propose ImitAL, a novel query strategy that encodes AL as a learning-to-rank problem. The underlying neural network is trained with Imitation Learning; the demonstrative expert experience required for training is generated from purely synthetic data. To show the general and superior applicability of ImitAL, we perform an extensive evaluation on 15 datasets from a wide range of domains, comparing our strategy with 10 state-of-the-art query strategies. We also show that our approach has lower runtime than most other strategies, especially on very large datasets.