Clinical Relation Extraction Using Transformer-based Models

Transformer architectures have had a major impact on NLP research. In the general English domain, transformer-based models have achieved state-of-the-art performance on a range of NLP benchmarks, and researchers have begun to investigate them for clinical applications. This study systematically explores three widely used transformer-based models (BERT, RoBERTa, and XLNet) for clinical relation extraction (RE) and develops an open-source package with clinical pre-trained transformer-based models to facilitate information extraction in the clinical domain. We developed a series of clinical RE models based on these three architectures and evaluated them on two publicly available datasets, from the 2018 MADE1.0 and 2018 n2c2 challenges. We compared two classification strategies (binary vs. multi-class classification) and investigated two approaches to generating candidate relations in different experimental settings. The RoBERTa-clinical RE model achieved the best performance on the 2018 MADE1.0 dataset, with an F1-score of 0.8958; on the 2018 n2c2 dataset, the XLNet-clinical model achieved the best F1-score of 0.9610. The binary classification strategy consistently outperformed the multi-class classification strategy for clinical relation extraction. Our methods and models are publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerRelationExtraction. We believe this work will improve current practice in clinical relation extraction and other related NLP tasks in the biomedical domain.
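
The abstract does not spell out how candidate relations are generated or how the two classification strategies differ in practice, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes candidates are entity pairs filtered by entity type and by sentence distance, that the multi-class strategy predicts a relation type or a "no relation" class, and that the binary strategy predicts only whether a pair is related (the type being implied by the entity types). The names `Entity`, `ALLOWED_TYPE_PAIRS`, `candidate_pairs`, `max_sent_distance`, and the `NonRel` label are all hypothetical and do not come from the released package.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Entity:
    id: str
    type: str      # e.g. "Drug", "ADE", "Dosage" (illustrative types)
    sent_idx: int  # index of the sentence that contains the entity


# Hypothetical schema: permitted (head type, tail type) pairs and the
# relation name each pair implies. Not taken from the paper or its code.
ALLOWED_TYPE_PAIRS = {
    ("Drug", "ADE"): "ADE-Drug",
    ("Drug", "Dosage"): "Dosage-Drug",
}


def candidate_pairs(entities, max_sent_distance=1):
    """Generate candidate relations: ordered entity pairs whose types are
    permitted and whose sentences are at most max_sent_distance apart."""
    pairs = []
    for head, tail in product(entities, repeat=2):
        if head.id == tail.id:
            continue
        if (head.type, tail.type) not in ALLOWED_TYPE_PAIRS:
            continue
        if abs(head.sent_idx - tail.sent_idx) > max_sent_distance:
            continue
        pairs.append((head, tail))
    return pairs


def multiclass_label(head, tail, gold):
    """Multi-class framing: the target is a relation type, or 'NonRel'."""
    return gold.get((head.id, tail.id), "NonRel")


def binary_label(head, tail, gold):
    """Binary framing: the target is 1 if the pair is related, else 0.
    The relation type is then read off the entity types."""
    return 1 if (head.id, tail.id) in gold else 0


entities = [
    Entity("d1", "Drug", 0),
    Entity("a1", "ADE", 0),
    Entity("s1", "Dosage", 3),
]
# Gold annotations, keyed by (head id, tail id).
gold = {("d1", "a1"): "ADE-Drug"}

same_or_adjacent = candidate_pairs(entities, max_sent_distance=1)
wider_window = candidate_pairs(entities, max_sent_distance=3)

for head, tail in wider_window:
    print(head.id, tail.id,
          multiclass_label(head, tail, gold),
          binary_label(head, tail, gold))
```

Widening `max_sent_distance` admits more candidates, most of them negatives, which is the kind of trade-off that different candidate-generation settings would expose. In both framings each candidate pair would be passed to a transformer classifier; only the label space changes.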

