Constructing large-scale medical knowledge graphs (MKGs) can significantly boost healthcare applications for medical surveillance and has attracted considerable attention in recent research. An essential step in constructing a large-scale MKG is extracting information from medical reports. Recently proposed information extraction techniques show promising performance on biomedical text. However, because biomedical text is noisy and its entities are correlated in complex ways, these methods consider only a limited set of entity and relation types. As a result, they fail to provide enough information for constructing MKGs, which restricts downstream applications. To address this issue, we propose Biomedical Information Extraction (BioIE), a hybrid neural network that extracts relations from biomedical text and unstructured medical reports. Our model uses a graph convolutional network enhanced with multi-head attention to capture complex relations and context information while remaining robust to noise in the data. We evaluate our model on two major biomedical relation extraction tasks, chemical-disease relation and chemical-protein interaction, and on a cross-hospital pan-cancer pathology report corpus. The results show that our method outperforms the baselines. Furthermore, we evaluate the applicability of our method in a transfer learning setting and show that BioIE achieves promising performance on medical text written in different formats and writing styles.