Relation Extraction (RE) is a foundational task in natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying the relations that hold between pairs of entities mentioned in the text. RE has numerous applications, such as knowledge graph completion, text summarization, question answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical RE, neural RE, and large language model-based RE. This survey begins with an overview of a few exemplary works from the earlier phases of RE, highlighting their limitations and shortcomings to contextualize subsequent progress. Next, we review popular benchmarks and critically examine the metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods.