Event extraction is a complex information extraction task that involves extracting events from unstructured text. Prior classification-based methods require comprehensive entity annotations for joint training, while newer generation-based methods rely on heuristic templates containing oracle information such as the event type, which is often unavailable in real-world scenarios. In this study, we consider a more realistic setting of this task, namely Oracle-Free Event Extraction (OFEE), where only the input context is given, without any oracle information, including the event type, event ontology, and trigger word. To solve this task, we propose a new framework, COFFEE, which extracts events based solely on the document context. In particular, COFFEE introduces a contrastive selection model to rectify the generated triggers and handle multi-event instances. COFFEE outperforms state-of-the-art approaches under the oracle-free setting, as evaluated on the public event extraction benchmark ACE05.