Biomedical event extraction is critical to understanding the biomolecular interactions described in the scientific literature. One of the main challenges is identifying nested structured events that are associated with non-indicative trigger words. We propose to incorporate domain knowledge from the Unified Medical Language System (UMLS) into a pre-trained language model via Graph Edge-conditioned Attention Networks (GEANet) and a hierarchical graph representation. To better recognize trigger words, each sentence is first grounded to a sentence graph based on a jointly modeled hierarchical knowledge graph (KG) from UMLS. The grounded graphs are then propagated through GEANet, a novel graph neural network designed to improve the inference of complex events. On the BioNLP 2011 GENIA Event Extraction task, our approach achieves F1 improvements of 1.41% and 3.19% on all events and complex events, respectively. Ablation studies confirm the importance of both GEANet and the hierarchical KG.
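To make the idea of edge-conditioned attention over a grounded sentence graph more concrete, the following is a minimal sketch of one round of message passing in which each message depends on the type of the edge it travels along. It is an illustration under stated assumptions, not the GEANet formulation itself: the function name `edge_conditioned_attention`, the additive use of edge-type embeddings, the scaled dot-product scoring, the residual update, and the node and edge-type names in the toy graph are all hypothetical choices made for this example.

```python
import math


def edge_conditioned_attention(node_feats, edges, edge_type_emb):
    """One round of attention-weighted message passing (illustrative only).

    node_feats:    dict mapping node id -> list of floats (all the same length)
    edges:         list of (src, dst, edge_type) tuples; messages flow src -> dst
    edge_type_emb: dict mapping edge_type -> list of floats (same length as features)

    Assumption: a message is the source feature plus the edge-type embedding,
    and it is scored against the destination feature by a scaled dot product.
    """
    dim = len(next(iter(node_feats.values())))

    # Collect (score, message) pairs for every destination node.
    incoming = {}
    for src, dst, etype in edges:
        msg = [h + e for h, e in zip(node_feats[src], edge_type_emb[etype])]
        score = sum(q * k for q, k in zip(node_feats[dst], msg)) / math.sqrt(dim)
        incoming.setdefault(dst, []).append((score, msg))

    updated = {}
    for node, h in node_feats.items():
        msgs = incoming.get(node, [])
        if not msgs:
            # Nodes with no incoming edges keep their features unchanged.
            updated[node] = list(h)
            continue
        # Numerically stable softmax over the incoming edges of this node.
        max_score = max(s for s, _ in msgs)
        weights = [math.exp(s - max_score) for s, _ in msgs]
        total = sum(weights)
        agg = [0.0] * dim
        for w, (_, msg) in zip(weights, msgs):
            for k in range(dim):
                agg[k] += (w / total) * msg[k]
        # Residual update: original feature plus the attended messages.
        updated[node] = [a + b for a, b in zip(h, agg)]
    return updated


# Toy graph with hypothetical node ids and edge types: a token is linked to
# a concept node, which is linked to a higher-level type node.
node_feats = {
    "token:induces": [1.0, 0.0],
    "concept:A": [0.0, 1.0],
    "type:B": [1.0, 1.0],
}
edges = [
    ("concept:A", "token:induces", "grounded_to"),
    ("type:B", "concept:A", "is_a"),
]
edge_type_emb = {"grounded_to": [0.5, 0.0], "is_a": [0.0, 0.5]}

result = edge_conditioned_attention(node_feats, edges, edge_type_emb)
```

In this sketch the edge type changes both the content of a message and its attention score, so the same neighbor can contribute differently depending on how it is connected. A trained model would replace the fixed vectors with learned parameters and would typically stack several such rounds.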