Joint extraction of entities and relations from unstructured text is a crucial task in information extraction. Recent methods achieve considerable performance but still suffer from some inherent limitations, such as redundant relation prediction, poor generalization of span-based extraction, and inefficiency. In this paper, we decompose this task from a novel perspective into three subtasks, namely Relation Judgement, Entity Extraction, and Subject-object Alignment, and then propose a joint relational triple extraction framework based on Potential Relation and Global Correspondence (PRGC). Specifically, we design a component to predict potential relations, which constrains the subsequent entity extraction to the predicted relation subset rather than all relations; then a relation-specific sequence tagging component is applied to handle the overlapping problem between subjects and objects; finally, a global correspondence component is designed to align the subject and object into a triple with low complexity. Extensive experiments show that PRGC achieves state-of-the-art performance on public benchmarks with higher efficiency and delivers consistent performance gains in complex scenarios of overlapping triples.
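The three-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the relation scores, tag sequences, and correspondence matrix are supplied by hand here, whereas in PRGC they would come from learned components. The function names, the BIO tag scheme, the 0.5 thresholds, and the choice to index the correspondence matrix by span start tokens are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def potential_relations(rel_probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Stage 1 (Relation Judgement): keep relations whose score clears the threshold."""
    return [rel for rel, p in rel_probs.items() if p > threshold]


def decode_bio(tags: List[str]) -> List[Span]:
    """Stage 2 helper: turn a BIO tag sequence into inclusive (start, end) spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i - 1))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i - 1))
            start = None
        # "I" continues the open span; a stray "I" with no open span is ignored
    if start is not None:
        spans.append((start, len(tags) - 1))
    return spans


def align(subjects: List[Span], objects: List[Span],
          corr: List[List[float]], threshold: float = 0.5) -> List[Tuple[Span, Span]]:
    """Stage 3 (Subject-object Alignment): keep pairs whose start tokens correspond.

    Assumption: corr[i][j] scores token i as a subject start paired with
    token j as an object start, independent of the relation.
    """
    return [(s, o) for s in subjects for o in objects if corr[s[0]][o[0]] > threshold]


def extract_triples(tokens: List[str],
                    rel_probs: Dict[str, float],
                    subj_tags: Dict[str, List[str]],
                    obj_tags: Dict[str, List[str]],
                    corr: List[List[float]]) -> List[Tuple[str, str, str]]:
    """Run the three stages; tagging is only consulted for the predicted relation subset."""
    triples = []
    for rel in potential_relations(rel_probs):
        subjects = decode_bio(subj_tags[rel])
        objects = decode_bio(obj_tags[rel])
        for s, o in align(subjects, objects, corr):
            subj_text = " ".join(tokens[s[0]:s[1] + 1])
            obj_text = " ".join(tokens[o[0]:o[1] + 1])
            triples.append((subj_text, rel, obj_text))
    return triples


# Hand-made toy inputs standing in for model outputs (hypothetical values).
tokens = ["Paris", "is", "the", "capital", "of", "France"]
rel_probs = {"capital_of": 0.9, "born_in": 0.1}
subj_tags = {"capital_of": ["B", "O", "O", "O", "O", "O"]}
obj_tags = {"capital_of": ["O", "O", "O", "O", "O", "B"]}
corr = [[0.0] * len(tokens) for _ in tokens]
corr[0][5] = 0.95  # "Paris" (subject start) corresponds to "France" (object start)

print(extract_triples(tokens, rel_probs, subj_tags, obj_tags, corr))
```

Because "born_in" falls below the relation threshold, no tag sequences are needed for it, which mirrors the abstract's point that entity extraction is restricted to the predicted relation subset rather than run over all relations.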