Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing

Motivation: A perennial challenge for biomedical researchers and clinical practitioners is to stay abreast of the rapid growth of publications and medical notes. Natural language processing (NLP) has emerged as a promising direction for taming information overload. In particular, large neural language models facilitate transfer learning by pretraining on unlabeled text, as exemplified by the successes of BERT models in various NLP applications. However, fine-tuning such models for an end task remains challenging, especially with small labeled datasets, which are common in biomedical NLP. Results: We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains. Large models have the potential to attain better performance, but increasing model size also exacerbates fine-tuning instability. We thus conduct a comprehensive exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks such as BIOSSES, reinitializing the top layer is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications. Availability and implementation: To facilitate progress in biomedical NLP, we release our state-of-the-art pretrained and fine-tuned models: https://aka.ms/BLURB.
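
The abstract names freezing lower layers and layerwise decay without giving formulas. As a minimal sketch, assuming one common formulation (the top encoder layer keeps the base learning rate and each layer below it is scaled by a further constant factor), the per-layer learning rates could be computed as follows. The function name, its parameters, and the numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def layerwise_learning_rates(num_layers, base_lr, decay, frozen_layers=0):
    """Return one learning rate per encoder layer (index 0 = lowest layer).

    Assumed formulation, not the paper's exact recipe:
      - the top layer is trained with base_lr;
      - each layer further down is scaled by `decay` once more;
      - the lowest `frozen_layers` layers get 0.0, i.e. they are frozen.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    if not 0 <= frozen_layers <= num_layers:
        raise ValueError("frozen_layers must be between 0 and num_layers")

    rates = []
    for layer in range(num_layers):
        if layer < frozen_layers:
            rates.append(0.0)  # frozen layer: no updates
        else:
            depth_from_top = num_layers - 1 - layer
            rates.append(base_lr * decay ** depth_from_top)
    return rates


# Illustrative values only: a 12-layer encoder (the depth of BERT-BASE).
decayed = layerwise_learning_rates(num_layers=12, base_lr=1e-4, decay=0.9)
frozen = layerwise_learning_rates(num_layers=12, base_lr=1e-4, decay=1.0,
                                  frozen_layers=4)
```

With `decay=1.0` and `frozen_layers > 0` the sketch reduces to plain lower-layer freezing, while `decay < 1.0` gives lower layers progressively smaller updates. In a training framework, each rate would typically be attached to the corresponding layer's parameter group in the optimizer.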
