Event Extraction (EE) is a fundamental task in Information Extraction (IE) that aims to recognize event mentions and their arguments (i.e., participants) in text. Given its importance, many methods and resources have been developed for EE. However, current EE research has under-explored non-English languages, mainly because high-quality multilingual EE datasets for model training and evaluation are lacking. To address this limitation, we propose a novel Multilingual Event Extraction dataset (MEE) that provides annotations for more than 50K event mentions in 8 typologically different languages. MEE comprehensively annotates entity mentions, event triggers, and event arguments. We conduct extensive experiments on the proposed dataset to reveal challenges and opportunities for multilingual EE.