Temporal Expression Extraction (TEE) is essential for understanding time in natural language. It has applications in Natural Language Processing (NLP) tasks such as question answering, information retrieval, and causal inference. To date, work in this area has mostly focused on English, as labeled data is scarce for other languages. We propose XLTime, a novel framework for multilingual TEE. XLTime works on top of pre-trained language models and leverages multi-task learning to prompt cross-language knowledge transfer both from English and within the non-English languages. XLTime alleviates problems caused by a shortage of data in the target language. We apply XLTime with different language models and show that it outperforms the previous automatic SOTA methods on French, Spanish, Portuguese, and Basque by large margins. XLTime also considerably closes the gap with the handcrafted HeidelTime method.