Most existing sequence-to-sequence Grammatical Error Correction (GEC) methods mainly focus on how to generate more pseudo data to obtain better performance. Little work addresses few-shot GEC domain adaptation. In this paper, we treat different GEC domains as different GEC tasks and propose to extend meta-learning to few-shot GEC domain adaptation without using any pseudo data. We exploit a set of data-rich source domains to learn an initialization of the model parameters that facilitates fast adaptation on new, resource-poor target domains. We adapt the GEC model to the first language (L1) of the second-language learner. To evaluate the proposed method, we use nine L1s as source domains and five L1s as target domains. Experimental results on the L1 GEC domain adaptation dataset demonstrate that the proposed approach outperforms the multi-task transfer learning baseline by 0.50 $F_{0.5}$ score on average and enables us to effectively adapt to a new L1 domain with only 200 parallel sentences.
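The core idea above, meta-learning an initialization over source domains so that a few gradient steps suffice on a new target domain, can be sketched as a first-order MAML-style update. The toy linear-regression tasks and all function names below are illustrative assumptions, not the paper's actual Transformer-based GEC model or training recipe:

```python
import numpy as np

def loss_and_grad(w, X, y):
    # Squared-error loss for a linear model y ~ X @ w, and its gradient.
    err = X @ w - y
    return 0.5 * np.mean(err ** 2), X.T @ err / len(y)

def maml_step(w, tasks, inner_lr=0.1, outer_lr=0.05, inner_steps=3):
    # One first-order MAML update: adapt separately on each source task
    # (inner loop), then move the shared initialization in the direction
    # of the post-adaptation gradients (outer loop).
    meta_grad = np.zeros_like(w)
    for X, y in tasks:
        w_task = w.copy()
        for _ in range(inner_steps):          # inner loop: task adaptation
            _, g = loss_and_grad(w_task, X, y)
            w_task -= inner_lr * g
        _, g = loss_and_grad(w_task, X, y)    # gradient at adapted params
        meta_grad += g
    return w - outer_lr * meta_grad / len(tasks)

rng = np.random.default_rng(0)

def make_task(slope, n=20):
    # Each "domain" is a regression task with a different true slope.
    X = rng.normal(size=(n, 1))
    return X, (X * slope).ravel()

# Meta-train the initialization on three data-rich source "domains".
w_init = np.zeros(1)
for _ in range(100):
    w_init = maml_step(w_init, [make_task(s) for s in (1.0, 2.0, 3.0)])

# Few-shot adaptation on an unseen target "domain" with a small sample.
X_t, y_t = make_task(2.5)
w_adapted = w_init.copy()
for _ in range(5):
    _, g = loss_and_grad(w_adapted, X_t, y_t)
    w_adapted -= 0.1 * g
```

In the paper's setting, each L1 plays the role of one task: the outer loop runs over the nine source L1s, and the inner loop corresponds to fine-tuning on the 200 target-L1 parallel sentences.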