The field of natural language processing (NLP) has recently seen a large shift towards using pre-trained language models for solving almost any task. Despite showing great improvements on benchmark datasets for various tasks, these models often perform sub-optimally in non-standard domains such as the clinical domain, where a large gap between pre-training documents and target documents is observed. In this paper, we aim to close this gap with domain-specific training of the language model, and we investigate its effect on a diverse set of downstream tasks and settings. We introduce the pre-trained CLIN-X (Clinical XLM-R) language models and show how CLIN-X outperforms other pre-trained transformer models by a large margin on ten clinical concept extraction tasks in two languages. In addition, we demonstrate how the transformer model can be further improved with our proposed task- and language-agnostic model architecture based on ensembles over random splits and cross-sentence context. Our studies in low-resource and transfer settings reveal stable model performance despite a lack of annotated data, with improvements of up to 47 F1 points when only 250 labeled sentences are available. Our results highlight the importance of specialized language models, such as CLIN-X, for concept extraction in non-standard domains, but also show that our task-agnostic model architecture is robust across the tested tasks and languages, so that domain- or task-specific adaptations are not required.