One of the challenges of event extraction under the traditional supervised learning paradigm is the need for a sizeable annotated dataset to achieve satisfactory model performance. The challenge is greater still in the finance and economics domain, which has considerably fewer resources. This paper presents a complete framework for extracting and processing crude oil-related events in the CrudeOilNews corpus, addressing annotation scarcity and class imbalance by leveraging transfer learning. Apart from event extraction, we place special emphasis on classifying event properties (Polarity, Modality, and Intensity) to determine the factual certainty of each event. We first build baseline models by supervised learning and then exploit transfer learning to boost event extraction performance despite the limited amount of annotated data and severe class imbalance. The methods used within the transfer learning framework are Domain Adaptive Pre-training, Multi-task Learning, and Sequential Transfer Learning. Based on experimental results, we improve all event extraction sub-task models in both F1 and MCC score compared with baseline models trained via standard supervised learning. Accurate and holistic event extraction from crude oil news is useful for downstream tasks such as understanding event chains and learning event-event relations, which can in turn serve further tasks such as commodity price prediction and summarisation, supporting a wide range of business decision making.
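The abstract reports both F1 and MCC (Matthews correlation coefficient) under severe class imbalance. The following is a minimal sketch, not taken from the paper, of how the two scores can diverge on an imbalanced binary task. The function name `f1_and_mcc` and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """Compute F1 and MCC from binary confusion-matrix counts.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four cells, so it also accounts for true negatives.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Illustrative data: 95 positive and 5 negative examples.
# A degenerate classifier that labels everything as positive:
f1_all_pos, mcc_all_pos = f1_and_mcc(tp=95, fp=5, fn=0, tn=0)

# A classifier that labels every example correctly:
f1_perfect, mcc_perfect = f1_and_mcc(tp=95, fp=0, fn=0, tn=5)

print(f"all-positive: F1={f1_all_pos:.3f}, MCC={mcc_all_pos:.3f}")
print(f"perfect:      F1={f1_perfect:.3f}, MCC={mcc_perfect:.3f}")
```

In this toy setting the all-positive classifier still obtains a high F1 on the majority class while its MCC is zero, which is one common motivation for reporting MCC alongside F1 when classes are imbalanced.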