Keeping the performance of language technologies optimal as time passes is of great practical interest. We study temporal effects on model performance on downstream language tasks, establishing a nuanced terminology for such discussion and identifying factors essential to conducting a robust study. We present experiments for several tasks in English where label correctness does not depend on time and demonstrate the importance of distinguishing between temporal model deterioration and temporal domain adaptation for systems using pre-trained representations. We find that, depending on the task, temporal model deterioration is not necessarily a concern. Temporal domain adaptation, however, is beneficial in all cases, with better performance for a given time period possible when the system is trained on temporally more recent data. We therefore also examine the efficacy of two approaches for temporal domain adaptation that require no human annotations on new data. Self-labeling shows consistent improvement and, notably, for named entity recognition, leads to better temporal adaptation than even human annotations.
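The self-labeling approach mentioned in the final sentence can be illustrated with a minimal sketch: a model trained on older annotated data assigns labels to newer, unlabeled data, and is then retrained on the union. The sketch below is an assumption-laden illustration, not the paper's implementation; it substitutes a scikit-learn TF-IDF classifier for the pre-trained representations used in the study, and the function name and confidence threshold are hypothetical.

    # Minimal self-labeling sketch (illustrative only, not the paper's pipeline):
    # train on old labeled data, pseudo-label newer unlabeled data, retrain on both.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def self_label_adapt(old_texts, old_labels, new_texts, threshold=0.9):
        # Fit a simple classifier on the temporally older, annotated data.
        vec = TfidfVectorizer()
        X_old = vec.fit_transform(old_texts)
        clf = LogisticRegression(max_iter=1000).fit(X_old, old_labels)

        # Predict labels for the temporally newer, unlabeled texts and keep
        # only the predictions the model is confident about.
        probs = clf.predict_proba(vec.transform(new_texts))
        confident = probs.max(axis=1) >= threshold
        pseudo_labels = clf.classes_[probs.argmax(axis=1)]

        # Retrain on the gold-labeled old data plus the confident pseudo-labels.
        kept_new = [t for t, keep in zip(new_texts, confident) if keep]
        X_all = vec.transform(list(old_texts) + kept_new)
        y_all = np.concatenate([np.asarray(old_labels), pseudo_labels[confident]])
        return vec, clf.fit(X_all, y_all)

Retraining on recent, automatically labeled data is what lets the system track temporal domain shift without the cost of new human annotation.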