Script event prediction aims to predict the subsequent event given the context, which requires the capability to infer correlations between events. Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge~(e.g., discourse relations). Despite promising results, challenges remain. First, the pretrained language models adopted by current works ignore event-level knowledge and thus fail to capture correlations between events well. Second, modeling event correlations with discourse relations is limited: it captures only the explicit correlations signaled by discourse markers and misses many implicit ones. To this end, we propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective and predicts the next event within a generative paradigm. Specifically, we first introduce a novel event-level blank-infilling strategy as the learning objective to inject event-level knowledge into the pretrained language model, and then design a likelihood-based contrastive loss for fine-tuning the generative model. Instead of using an additional prediction layer, we perform prediction with the sequence likelihoods produced by the generative model. Our approach models correlations between events in a soft way without any external knowledge, and the likelihood-based prediction eliminates the need for additional prediction networks while offering some interpretability, since it scores each word in the event. Experimental results on the multi-choice narrative cloze~(MCNC) task demonstrate that our approach achieves better results than other state-of-the-art baselines. Our code will be available at \url{https://github.com/zhufq00/mcnc}.
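A minimal sketch of the likelihood-based prediction idea: each candidate event is scored by the (length-normalized) sum of its per-token log-likelihoods, and the highest-scoring candidate is chosen, with no extra prediction head. The token probabilities, function names, and normalization below are hypothetical stand-ins for a fine-tuned generative language model's scores, not the paper's implementation.

```python
# Toy illustration of likelihood-based candidate ranking.
# TOKEN_LOGPROB is a hypothetical stand-in for context-conditioned
# per-token scores from a fine-tuned generative LM.
TOKEN_LOGPROB = {
    "he": -1.0, "opened": -2.0, "the": -0.5, "door": -2.5,
    "ate": -3.0, "moon": -6.0,
}
DEFAULT_LOGPROB = -8.0  # score for unseen tokens

def sequence_logprob(event: str) -> float:
    """Sum per-token log-likelihoods; length-normalize so that
    shorter candidates are not trivially favored."""
    tokens = event.lower().split()
    total = sum(TOKEN_LOGPROB.get(t, DEFAULT_LOGPROB) for t in tokens)
    return total / len(tokens)

def predict_next_event(candidates: list[str]) -> str:
    # No additional prediction layer: rank candidate events
    # directly by their generation likelihood.
    return max(candidates, key=sequence_logprob)

candidates = ["he opened the door", "he ate the moon"]
print(predict_next_event(candidates))  # -> he opened the door
```

Because the score decomposes over tokens, inspecting the per-word log-likelihoods offers the word-level interpretability mentioned above.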