The creation of large-scale open-domain reading comprehension data sets in recent years has enabled the development of end-to-end neural comprehension models with promising results. To apply these models to domains with limited training data, one of the most effective approaches is to first pretrain them on large out-of-domain source data and then fine-tune them on the limited target data. The caveat is that, after fine-tuning, the comprehension models tend to perform poorly in the source domain, a phenomenon known as catastrophic forgetting. In this paper, we explore methods that overcome catastrophic forgetting during fine-tuning without assuming access to data from the source domain. We introduce new auxiliary penalty terms and observe the best performance when a combination of auxiliary penalty terms is used to regularise the fine-tuning process for adapting comprehension models. To test our methods, we develop and release six narrow-domain data sets that could potentially be used as reading comprehension benchmarks.
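The abstract does not spell out the form of the auxiliary penalty terms, so as a minimal illustrative sketch, the snippet below shows one generic instance of the idea: an L2 penalty anchoring the fine-tuned parameters to a snapshot of the pretrained (source-domain) weights, added to the target-domain task loss. The model, penalty weight, and data here are hypothetical placeholders, not the paper's actual method.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained comprehension model.
model = nn.Linear(8, 2)

# Snapshot the source-domain (pretrained) weights before fine-tuning.
anchor = {n: p.detach().clone() for n, p in model.named_parameters()}

def auxiliary_penalty(model, anchor, weight=0.01):
    """Squared L2 distance between current and pretrained parameters,
    penalising drift away from the source-domain solution."""
    return weight * sum((p - anchor[n]).pow(2).sum()
                        for n, p in model.named_parameters())

# One fine-tuning step on (toy) target-domain data: the auxiliary
# penalty regularises the update so the model stays near the anchor.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(model(x), y) + auxiliary_penalty(model, anchor)
loss.backward()
optimizer.step()
```

Note that this anchor-based penalty requires only the pretrained weights, not the source data itself, which is consistent with the paper's setting of no access to the source domain during fine-tuning.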