In this paper, we describe our participation in Subtask 1 of CASE-2022, Event Causality Identification with the Causal News Corpus. We address the Causal Relation Identification (CRI) task by exploiting a set of simple yet complementary techniques for fine-tuning language models (LMs) on a small number of annotated examples (i.e., a few-shot configuration). We follow a prompt-based prediction approach for fine-tuning LMs in which the CRI task is treated as a masked language modeling (MLM) problem. This approach allows LMs natively pre-trained on MLM objectives to directly generate textual responses to CRI-specific prompts. We compare the performance of this method against ensemble techniques trained on the entire dataset. Our best-performing submission was trained with only 256 instances per class, a small portion of the entire dataset, and yet obtained the second-best precision (0.82), the third-best accuracy (0.82), and an F1-score (0.85) very close to that reported by the winning team (0.86).
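To make the prompt-based MLM formulation concrete, the sketch below shows one common way to cast a binary classification task as masked token prediction. The model name, prompt template, and verbalizer words here are illustrative assumptions for exposition, not the templates used in the paper; the idea is simply that the LM fills a `[MASK]` slot and the predicted label word is mapped back to a class.

```python
# Minimal sketch of prompt-based MLM classification for causal relation
# identification, assuming a hypothetical template and verbalizer.
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any MLM-pretrained LM works here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

# Hypothetical verbalizer: label words the LM may predict at the mask,
# mapped to task classes (1 = causal relation present, 0 = absent).
VERBALIZER = {"causal": 1, "unrelated": 0}

def classify(sentence: str) -> int:
    # Hypothetical prompt template; the LM predicts the masked label word.
    prompt = f"{sentence} The events in this sentence are {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position and score only the verbalizer tokens.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    mask_logits = logits[0, mask_pos.item()]
    word_ids = {w: tokenizer.convert_tokens_to_ids(w) for w in VERBALIZER}
    best_word = max(word_ids, key=lambda w: mask_logits[word_ids[w]].item())
    return VERBALIZER[best_word]

print(classify("The earthquake triggered a tsunami that destroyed the port."))
```

In a few-shot setting, the same template is applied to the labeled examples and the LM is fine-tuned on the masked label word, so the model is trained with the same objective it was pre-trained on rather than with a freshly initialized classification head.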