Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts. Evaluation of the approaches, however, has been limited in a number of dimensions. In particular, the precision/recall/F1 scores typically reported provide few insights into the range of errors the models make. We build on the work of Kummerfeld and Klein (2013) to propose a transformation-based framework for automating error analysis in document-level event and (N-ary) relation extraction. We employ our framework to compare two state-of-the-art document-level template-filling approaches on datasets from three domains, and then, to gauge progress in IE since its inception 30 years ago, compare them against four systems from the MUC-4 (1992) evaluation.