Relation extraction (RE) consists in automatically identifying and structuring relations of interest from texts. Recently, BERT improved state-of-the-art performance on several NLP tasks, including RE. However, the best way to use BERT, both within a machine learning architecture and within a transfer learning strategy, remains an open question, since it is highly dependent on the specific task and domain. Here, we explore various BERT-based architectures and transfer learning strategies (i.e., frozen or fine-tuned) for the task of biomedical RE on two corpora. Among the tested architectures and strategies, our *BERT-segMCNN with fine-tuning outperforms the state of the art on both corpora (1.73% and 32.77% absolute improvement on ChemProt and PGxCorpus, respectively). More generally, our experiments illustrate the expected benefit of fine-tuning BERT, but also the so-far-unexplored advantage of using structural information (via sentence segmentation) in addition to the context classically leveraged by BERT.
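To make the frozen vs. fine-tuned distinction concrete, here is a minimal sketch, not the paper's actual code: it contrasts using BERT as a fixed feature extractor (frozen) with updating its weights alongside a task head (fine-tuned). The checkpoint name, the linear relation-classification head, and the label count are illustrative assumptions; PyTorch and the Hugging Face `transformers` library are assumed available.

```python
# Minimal sketch (assumed setup, not the authors' implementation):
# contrasting the "frozen" and "fine-tuned" transfer strategies.
import torch
from transformers import AutoModel

# Placeholder checkpoint; the paper targets biomedical text, so a
# domain-specific BERT variant could be substituted here.
bert = AutoModel.from_pretrained("bert-base-uncased")

strategy = "frozen"  # or "fine-tuned"
if strategy == "frozen":
    # Frozen: BERT parameters receive no gradient updates; only the
    # task-specific head on top is trained.
    for param in bert.parameters():
        param.requires_grad_(False)

# Hypothetical relation-classification head over BERT's pooled output.
num_relations = 10  # placeholder: depends on the corpus label set
head = torch.nn.Linear(bert.config.hidden_size, num_relations)

# Only parameters that still require gradients are handed to the
# optimizer, so the same training loop covers both strategies.
trainable = [p for m in (bert, head) for p in m.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)
```

Under the frozen strategy, only the head's parameters reach the optimizer; under fine-tuning, all of BERT's weights do as well, which is the setting the abstract reports as performing best.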