Extracting entities and relations is an essential task in information extraction. Triplets extracted from a sentence may overlap with one another. Previous methods either did not address the overlapping problem or solved it only partially. To handle triplet overlap completely, we first extract candidate subjects with a standard span mechanism. We then present a labeled span mechanism that extracts objects and relations simultaneously: it generates labeled spans whose start and end positions indicate the objects and whose labels correspond to the relations between the subject and those objects. In addition, we design an entity attention mechanism to enhance the information fusion between the subject and the sentence when extracting objects and relations. We evaluate our method on two public datasets, and it achieves the best performance on both.
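To make the labeled span idea concrete, the following is a minimal decoding sketch, not the model itself. It assumes that, for each candidate subject, a tagger has already produced per-token sets of relation labels marking object starts and object ends. The function names (`decode_labeled_spans`, `extract_triplets`), the relation names, and the nearest-end pairing heuristic are all illustrative assumptions rather than details taken from the method described above.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]          # inclusive token indices (start, end)
Triplet = Tuple[str, str, str]  # (subject text, relation, object text)
Labels = List[Set[str]]         # one set of relation labels per token


def decode_labeled_spans(start_labels: Labels, end_labels: Labels) -> List[Tuple[int, int, str]]:
    """Pair each labeled start with the nearest end carrying the same label.

    The nearest-end rule is an assumed heuristic for this sketch.
    """
    spans = []
    for i, rels in enumerate(start_labels):
        for rel in sorted(rels):
            for j in range(i, len(end_labels)):
                if rel in end_labels[j]:
                    spans.append((i, j, rel))
                    break
    return spans


def extract_triplets(
    tokens: List[str],
    subject_spans: List[Span],
    labels_per_subject: Dict[Span, Tuple[Labels, Labels]],
) -> List[Triplet]:
    """Turn each subject's labeled spans into (subject, relation, object) triplets."""
    triplets = []
    for subj in subject_spans:
        start_labels, end_labels = labels_per_subject[subj]
        subj_text = " ".join(tokens[subj[0]: subj[1] + 1])
        for s, e, rel in decode_labeled_spans(start_labels, end_labels):
            triplets.append((subj_text, rel, " ".join(tokens[s: e + 1])))
    return triplets


def empty_labels(n: int) -> Labels:
    return [set() for _ in range(n)]


# Hypothetical example: "United States" is the object of two different subjects,
# and "Honolulu" is both an object and a subject, so the triplets overlap.
tokens = ["Obama", "was", "born", "in", "Honolulu", ",", "United", "States"]
n = len(tokens)

obama_start, obama_end = empty_labels(n), empty_labels(n)
obama_start[4].add("born_in")
obama_end[4].add("born_in")
obama_start[6].add("born_in_country")
obama_end[7].add("born_in_country")

honolulu_start, honolulu_end = empty_labels(n), empty_labels(n)
honolulu_start[6].add("located_in")
honolulu_end[7].add("located_in")

triplets = extract_triplets(
    tokens,
    subject_spans=[(0, 0), (4, 4)],
    labels_per_subject={
        (0, 0): (obama_start, obama_end),
        (4, 4): (honolulu_start, honolulu_end),
    },
)
for t in triplets:
    print(t)
```

Because objects and relations are decoded separately for each subject, and each token may carry several relation labels, an entity can appear in more than one triplet and an entity pair can hold more than one relation without the spans conflicting.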