Despite rapid developments in machine learning research, collecting high-quality labels for supervised learning remains a bottleneck for many applications. This difficulty is exacerbated by the fact that state-of-the-art models for NLP tasks are becoming deeper and more complex, often increasing the amount of training data required even for fine-tuning. Weak supervision methods, including data programming, address this problem and reduce the cost of label collection by using noisy label sources for supervision. However, until recently, data programming was only accessible to users who knew how to program. To bridge this gap, the Data Programming by Demonstration (DPBD) framework was proposed to automatically create labeling functions from a few examples labeled by a domain expert. This framework has proven successful for generating high-accuracy labeling models for document classification. In this work, we extend the DPBD framework to span-level annotation, arguably one of the most time-consuming NLP labeling tasks. We built a novel tool, TagRuler, that makes it easy for annotators to build span-level labeling functions without programming and encourages them to explore trade-offs between different labeling models and active learning strategies. We empirically demonstrate that an annotator can achieve a higher F1 score with the proposed tool than with manual labeling on different span-level annotation tasks.
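To make the idea of a span-level labeling function concrete, the following is a minimal Python sketch and not TagRuler's actual implementation. The three rules, the `PER`/`O` label set, the function names, and the plain majority-vote aggregation are all assumptions introduced for illustration; the abstract does not specify which rules or labeling model the tool uses.

```python
from collections import Counter

ABSTAIN = None  # a labeling function may decline to vote on a token


def lf_title_prefix(tokens):
    """Hypothetical rule: the token right after an honorific is a person name."""
    titles = {"Dr.", "Mr.", "Ms.", "Prof."}
    votes = [ABSTAIN] * len(tokens)
    for i in range(1, len(tokens)):
        if tokens[i - 1] in titles:
            votes[i] = "PER"
    return votes


def lf_known_names(tokens):
    """Hypothetical rule: tokens found in a small name list are person names."""
    names = {"Smith", "Jones"}
    return ["PER" if t in names else ABSTAIN for t in tokens]


def lf_lowercase_is_other(tokens):
    """Hypothetical rule: all-lowercase tokens lie outside any entity span."""
    return ["O" if t.islower() else ABSTAIN for t in tokens]


def majority_vote(tokens, lfs, default="O"):
    """Combine noisy per-token votes; tokens with no votes get the default label."""
    all_votes = [lf(tokens) for lf in lfs]
    labels = []
    for per_token in zip(*all_votes):
        counts = Counter(v for v in per_token if v is not ABSTAIN)
        labels.append(counts.most_common(1)[0][0] if counts else default)
    return labels


tokens = ["Dr.", "Smith", "visited", "the", "clinic"]
labels = majority_vote(
    tokens, [lf_title_prefix, lf_known_names, lf_lowercase_is_other]
)
print(list(zip(tokens, labels)))
```

Each function votes on every token or abstains, so individually noisy rules can still agree on a span. Majority voting is used here only because it is the simplest aggregator; a learned labeling model would weight the functions by estimated accuracy instead.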