Entity and relation extraction is a key task in information extraction, whose output can be used for downstream NLP tasks. Existing approaches to entity and relation extraction mainly focus on English corpora and ignore other languages; it is therefore critical to improve performance in a multilingual setting. Meanwhile, multilingual training is usually used to boost cross-lingual performance by transferring knowledge from some (e.g., high-resource) languages to other (e.g., low-resource) languages. However, language interference usually arises in multilingual tasks because the model parameters are shared across all languages. In this paper, we propose a two-stage multilingual training method and a joint model called the Multilingual Entity and Relation Extraction framework (mERE) to mitigate language interference across languages. Specifically, we randomly concatenate sentences in different languages to train a Language-universal Aggregator (LA), which narrows the distance between embedding representations by learning a unified language representation. We then separate parameters to mitigate interference by tuning a Language-specific Switcher (LS), which consists of several independent sub-modules that refine language-specific feature representations. After that, to enhance relational triple extraction, the sentence representation concatenated with the relation feature is used to recognize the entities. Extensive experimental results show that our method outperforms both monolingual and multilingual baselines. In addition, a detailed analysis shows that mERE is lightweight yet effective for relational triple extraction and transfers easily to other backbone models of multi-field tasks, which further demonstrates the effectiveness of our method.
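To make the two-stage design concrete, below is a minimal PyTorch sketch of the idea as described above: a shared aggregator trained on randomly concatenated multilingual sentences (stage one), followed by independent per-language switcher sub-modules whose parameters are not shared (stage two). All class and function names here (`LanguageUniversalAggregator`, `LanguageSpecificSwitcher`, `concat_multilingual_batch`) are hypothetical illustrations, not the paper's actual implementation.

```python
import random
import torch
import torch.nn as nn


class LanguageUniversalAggregator(nn.Module):
    """Stage 1 (sketch): a shared module trained on randomly concatenated
    multilingual sentences to learn a unified language representation."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Map encoder states into a language-universal space.
        return torch.tanh(self.proj(h))


class LanguageSpecificSwitcher(nn.Module):
    """Stage 2 (sketch): one independent sub-module per language; only the
    sub-module matching the input language is activated, so language-specific
    parameters are separated rather than shared."""

    def __init__(self, hidden_size: int, languages: list[str]):
        super().__init__()
        self.experts = nn.ModuleDict(
            {lang: nn.Linear(hidden_size, hidden_size) for lang in languages}
        )

    def forward(self, h: torch.Tensor, lang: str) -> torch.Tensor:
        # Residual refinement of the language-specific feature representation.
        return h + self.experts[lang](h)


def concat_multilingual_batch(
    sentences_by_lang: dict[str, list[str]], k: int = 2
) -> str:
    """Randomly concatenate sentences from k different languages,
    producing the mixed-language input used to train the aggregator."""
    langs = random.sample(list(sentences_by_lang), k)
    return " ".join(random.choice(sentences_by_lang[lang]) for lang in langs)
```

In this sketch, stage one would update the aggregator on mixed-language inputs from `concat_multilingual_batch`, while stage two would freeze the shared parameters and tune only the per-language experts, matching the parameter-separation motivation in the abstract.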