Our world is constantly evolving, and so is the content on the web. Consequently, our languages, often said to mirror the world, are dynamic in nature. However, most current contextual language models are static and cannot adapt to changes over time. In this work, we propose a temporal contextual language model called TempoBERT, which uses time as an additional context for texts. Our technique is based on modifying texts with temporal information and performing time masking, i.e., masking specific to the supplementary time information. We apply our approach to the tasks of semantic change detection and sentence time prediction, experimenting on datasets that are diverse in terms of time span, size, genre, and language. Our extensive evaluation shows that both tasks benefit from exploiting time masking.
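To make the idea concrete, below is a minimal sketch (not the authors' released code) of time masking with Hugging Face Transformers: each text is prefixed with a special time token, and that token is the one replaced by [MASK] during training, so the model learns to predict the time of the text. The token names (e.g. "<2020>"), the example sentence, and the use of bert-base-uncased are illustrative assumptions.

```python
# Sketch of time masking: prepend a time token to each text and mask it.
from transformers import BertTokenizerFast, BertForMaskedLM
import torch

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Assumed: one special token per time period; resize embeddings to match.
time_tokens = ["<2018>", "<2019>", "<2020>"]
tokenizer.add_special_tokens({"additional_special_tokens": time_tokens})
model.resize_token_embeddings(len(tokenizer))

# Prepend the time token to the text, then mask it ("time masking").
text = "<2020> the vaccine rollout dominated the news"
enc = tokenizer(text, return_tensors="pt")
input_ids = enc["input_ids"].clone()
labels = torch.full_like(input_ids, -100)               # ignore every position ...
time_pos = input_ids == tokenizer.convert_tokens_to_ids("<2020>")
labels[time_pos] = input_ids[time_pos]                   # ... except the time token
input_ids[time_pos] = tokenizer.mask_token_id            # replace it with [MASK]

# Masked-LM loss is computed only on the masked time token.
loss = model(input_ids=input_ids,
             attention_mask=enc["attention_mask"],
             labels=labels).loss
```

The same masked prediction can be reused at inference time for sentence time prediction: scoring the time tokens at the masked position yields a distribution over time periods for a given sentence.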