Script event prediction aims to predict the subsequent event given the context, which requires the capability to infer correlations between events. Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge~(e.g., discourse relations). Although promising results have been achieved, several challenges remain. First, the pretrained language models adopted by current works ignore event-level knowledge and thus fail to capture correlations between events well. Second, modeling event correlations with discourse relations is limited: it captures only explicit correlations signaled by discourse markers and misses many implicit ones. To this end, we propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective and predicts the next event within a generative paradigm. Specifically, we first introduce a novel event-level blank-infilling strategy as the learning objective to inject event-level knowledge into the pretrained language model, and then design a likelihood-based contrastive loss for fine-tuning the generative model. Instead of using an additional prediction layer, we perform prediction with the sequence likelihoods produced by the generative model. Our approach thus models correlations between events in a soft way without any external knowledge, and the likelihood-based prediction eliminates the need for additional prediction networks while remaining somewhat interpretable, since it scores each word in the event. Experimental results on the multi-choice narrative cloze~(MCNC) task demonstrate that our approach achieves better results than other state-of-the-art baselines. Our code will be available at https://github.com/zhufq00/mcnc.
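The likelihood-based prediction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `token_log_prob` function is a purely hypothetical stand-in for a fine-tuned generative language model's conditional token probabilities, and the candidate with the highest summed token log-likelihood is selected as the predicted next event.

```python
import math

def token_log_prob(token: str, prefix: list) -> float:
    """Toy stand-in for a generative LM's conditional log P(token | prefix).
    A real system would query a fine-tuned pretrained model instead."""
    # Crudely mimic a model that prefers continuations coherent with the
    # context by boosting tokens that already appeared in the prefix.
    boost = 0.5 if token in prefix else 0.0
    return math.log(0.1 + boost)

def sequence_log_likelihood(context: list, event: list) -> float:
    """Score a candidate event by summing token log-probabilities,
    each conditioned on the context plus previously scored tokens."""
    score = 0.0
    prefix = list(context)
    for tok in event:
        score += token_log_prob(tok, prefix)
        prefix.append(tok)
    return score

def predict_next_event(context: list, candidates: list) -> int:
    """Return the index of the candidate event with the highest
    sequence likelihood -- no extra prediction layer is needed."""
    scores = [sequence_log_likelihood(context, ev) for ev in candidates]
    return max(range(len(scores)), key=scores.__getitem__)

context = ["she", "entered", "the", "restaurant"]
candidates = [
    ["she", "ordered", "food"],         # coherent with context
    ["he", "launched", "a", "rocket"],  # unrelated
]
print(predict_next_event(context, candidates))  # -> 0
```

Because the final score is a sum over per-token log-probabilities, one can inspect which words in a candidate event raised or lowered its likelihood, which is the source of the interpretability noted above.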