Diversity-Aware Batch Active Learning for Dependency Parsing

While the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert-annotated treebank data, not all annotations contribute equally to the training of the parsers. In this paper, we attempt to reduce the number of labeled examples needed to train a strong dependency parser using batch active learning (AL). In particular, we investigate whether enforcing diversity in the sampled batches, using determinantal point processes (DPPs), can improve over their diversity-agnostic counterparts. Simulation experiments on an English newswire corpus show that selecting diverse batches with DPPs is superior to strong selection strategies that do not enforce batch diversity, especially during the initial stages of the learning process. Additionally, our diversity-aware strategy is robust under a corpus duplication setting, where diversity-agnostic sampling strategies exhibit significant degradation.
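
To make the idea of a diversity-enforcing batch concrete, here is a minimal sketch of DPP-style batch selection. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not say how quality scores or sentence similarities are computed, nor whether batches are sampled from the DPP or chosen by MAP inference. The sketch assumes a per-sentence quality score (for example, parser uncertainty), a feature vector per sentence, cosine similarity, and a simple greedy approximation to the most probable batch. The function name `greedy_dpp_batch` and all inputs are hypothetical.

```python
import numpy as np


def greedy_dpp_batch(features, quality, batch_size):
    """Greedily pick indices that approximately maximize det(L_Y).

    Assumptions (not from the paper): `features` holds one row per
    candidate sentence, `quality` holds one positive score per
    candidate, and similarity is cosine similarity of the rows.
    """
    # Normalize rows so that the dot product is cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T

    # L-ensemble kernel: L_ij = q_i * S_ij * q_j.
    kernel = quality[:, None] * similarity * quality[None, :]

    selected = []
    candidates = list(range(len(quality)))
    for _ in range(min(batch_size, len(candidates))):
        best_item, best_logdet = None, -np.inf
        for i in candidates:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            # A non-positive determinant means item i is redundant
            # given the items already selected.
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:
            break  # every remaining candidate is redundant
        selected.append(best_item)
        candidates.remove(best_item)
    return selected


# Toy pool: sentences 0 and 1 are exact duplicates, sentence 2 differs.
features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
quality = np.array([2.0, 1.5, 0.5])
batch = greedy_dpp_batch(features, quality, batch_size=2)
```

In this toy pool, ranking by quality alone would pick the two duplicate sentences, whereas the determinant of a kernel containing both duplicates is zero, so the greedy selection skips the second copy and takes the lower-scoring but different sentence. This mirrors the corpus duplication setting mentioned in the abstract.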
