We revisit the classic problem of document-level role-filler entity extraction (REE) for template filling. We argue that sentence-level approaches are ill-suited to the task and introduce a generative transformer-based encoder-decoder framework (GRIT) designed to model context at the document level: it can make extraction decisions across sentence boundaries, is implicitly aware of noun phrase coreference structure, and has the capacity to respect cross-role dependencies in the template structure. We evaluate our approach on the MUC-4 dataset and show that our model performs substantially better than prior work. We also show that our modeling choices contribute to model performance, e.g., by implicitly capturing linguistic knowledge such as recognizing coreferent entity mentions.