Language use differs between domains, and even within a domain, it changes over time. For pre-trained language models like BERT, domain adaptation through continued pre-training has been shown to improve performance on in-domain downstream tasks. In this article, we investigate whether temporal adaptation can bring additional benefits. For this purpose, we introduce a corpus of social media comments sampled over three years. It contains unlabelled data for adaptation and evaluation on an upstream masked language modelling task, as well as labelled data for fine-tuning and evaluation on a downstream document classification task. We find that temporality matters for both tasks: temporal adaptation improves upstream task performance, and temporal fine-tuning improves downstream task performance. Time-specific models generally perform better on past than on future test sets, which matches evidence on the bursty usage of topical words. However, adapting BERT to time and domain does not improve downstream performance over adapting to domain alone. Token-level analysis shows that temporal adaptation captures event-driven changes in language use in the downstream task, but not the changes that are actually relevant to task performance. Based on our findings, we discuss when temporal adaptation may be more effective.