Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition

Fine-tuning pre-trained language models has recently become common practice in building NLP models for various tasks, especially few-shot tasks. We argue that under the few-shot setting, formulating fine-tuning closer to the pre-training objectives should unlock more benefits from the pre-trained language models. In this work, we take few-shot named entity recognition (NER) for a pilot study, where existing fine-tuning strategies differ substantially from pre-training. We propose a novel few-shot fine-tuning framework for NER, FFF-NER. Specifically, we introduce three new types of tokens, "is-entity", "which-type" and bracket, so we can formulate the NER fine-tuning as (masked) token prediction or generation, depending on the choice of pre-trained language models. In our experiments, we apply FFF-NER to fine-tune both BERT and BART for few-shot NER on several benchmark datasets and observe significant improvements over existing fine-tuning strategies, including sequence labeling, prototype meta-learning, and prompt-based approaches. We further perform a series of ablation studies, showing that few-shot NER performance is strongly correlated with the similarity between fine-tuning and pre-training.
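To make the span-level formulation concrete, the following is a minimal sketch of how a candidate span might be marked up with the three token types named in the abstract. The abstract does not give the surface forms of these tokens or where they are placed, so the placeholder strings, the layout, and the helper names `format_span` and `enumerate_spans` are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical surface forms: the abstract names the token types
# ("is-entity", "which-type", bracket) but not how they are written.
IS_ENTITY = "[IS-ENTITY]"
WHICH_TYPE = "[WHICH-TYPE]"
BRACKET_OPEN = "[B-OPEN]"
BRACKET_CLOSE = "[B-CLOSE]"


def format_span(tokens: List[str], start: int, end: int) -> List[str]:
    """Mark the candidate span tokens[start:end] for token-level prediction.

    Assumed layout: two prediction slots followed by the bracketed span.
    A model would predict at the IS_ENTITY slot whether the span is an
    entity, and at the WHICH_TYPE slot which entity type it has.
    """
    if not (0 <= start < end <= len(tokens)):
        raise ValueError("span must be non-empty and inside the sentence")
    return (
        tokens[:start]
        + [IS_ENTITY, WHICH_TYPE, BRACKET_OPEN]
        + tokens[start:end]
        + [BRACKET_CLOSE]
        + tokens[end:]
    )


def enumerate_spans(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """List every (start, end) span of at most max_len tokens, end exclusive."""
    return [
        (s, e)
        for s in range(n_tokens)
        for e in range(s + 1, min(s + max_len, n_tokens) + 1)
    ]


sentence = ["Barack", "Obama", "visited", "Paris"]
marked = format_span(sentence, 0, 2)
```

With a masked language model such as BERT, the two slots would presumably be filled by masked-token prediction; with a sequence-to-sequence model such as BART, they would be generated instead. How candidate spans are selected during training is not stated in the abstract, so `enumerate_spans` simply lists them all.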


Related content

Few-shot learning


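Few-shot learning (FSL) addresses how to improve the performance of neural networks when little data is available. One class of methods frequently used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing respectively.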
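Meta-training and meta-testing are commonly organized into episodes, each a small N-way K-shot task with a support set to adapt on and a query set to evaluate on. The sketch below shows one way such an episode could be sampled. The function name `sample_episode` and its parameters are illustrative assumptions, and the text above does not prescribe this particular procedure.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Example = Tuple[str, Hashable]  # (input, label)


def sample_episode(
    data: Sequence[Example],
    n_way: int,
    k_shot: int,
    n_query: int,
    rng: random.Random,
) -> Tuple[List[Example], List[Example]]:
    """Sample one N-way K-shot episode as (support set, query set)."""
    # Group the examples by label.
    by_label: Dict[Hashable, List[Example]] = defaultdict(list)
    for x, y in data:
        by_label[y].append((x, y))

    # Only classes with enough examples for both sets can be used.
    eligible = [y for y, items in by_label.items() if len(items) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support: List[Example] = []
    query: List[Example] = []
    for y in rng.sample(eligible, n_way):
        # Draw without replacement so support and query never overlap.
        picked = rng.sample(by_label[y], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query


# Toy data: 3 classes with 5 examples each.
toy = [(f"{label}-{i}", label) for label in ("PER", "LOC", "ORG") for i in range(5)]
support, query = sample_episode(toy, n_way=2, k_shot=2, n_query=1, rng=random.Random(0))
```

In this setup, meta-training would draw episodes from one set of classes and meta-testing from a disjoint set, so that the model is evaluated on classes it has not seen.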

[ACL 2020] Named Entity Recognition as Dependency Parsing

Zhuanzhi Member Service

61+ reads · May 15, 2020

100+ papers: the latest collection on Self-Supervised Learning

Zhuanzhi Member Service

167+ reads · March 18, 2020

[A large collection of cross-lingual BERT models] Transfer learning is increasingly going multilingual with language-specific BERT models

Zhuanzhi Member Service

54+ reads · January 30, 2020

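A study of public anxiety in issue communities on social networks, by Lecturer Ta Na (塔娜), School of Journalism, Renmin University of China, at the 8th National Conference on Social Media Processing (SMP 2019)

Zhuanzhi Member Service

15+ reads · October 23, 2019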