This paper presents the first study on using large-scale pre-trained language models for automated generation of an event-level temporal graph for a document. Despite the huge success of neural pre-training methods in NLP tasks, their potential for temporal reasoning over event graphs has not been sufficiently explored. Part of the reason is the difficulty of obtaining large training corpora with human-annotated events and temporal links. We address this challenge by using existing IE/NLP tools to automatically generate a large quantity (89,000) of system-produced document-graph pairs, and we propose a novel formulation of the contextualized graph generation problem as a sequence-to-sequence mapping task. These strategies enable us to leverage and fine-tune pre-trained language models on the system-induced training data for the graph generation task. Our experiments show that our approach is highly effective in generating structurally and semantically valid graphs. Further, evaluation on a challenging hand-labeled, out-of-domain corpus shows that our method outperforms the closest existing method by a large margin on several metrics. Code and pre-trained models are available at https://github.com/madaan/temporal-graph-gen.