News recommendation is a widely adopted technique to provide personalized news feeds for users. Recently, pre-trained language models (PLMs) have demonstrated great capability in natural language understanding and benefited news recommendation by improving news modeling. However, most existing works simply fine-tune the PLM on the news recommendation task, which may suffer from the well-known domain shift problem between the pre-training corpus and downstream news texts. Moreover, PLMs usually contain a large number of parameters and incur high computational overhead, which imposes a heavy burden on low-latency online services. In this paper, we propose Tiny-NewsRec, which can improve both the effectiveness and the efficiency of PLM-based news recommendation. We first design a self-supervised domain-specific post-training method to better adapt the general PLM to the news domain, using a contrastive matching task between news titles and news bodies. We further propose a two-stage knowledge distillation method to improve the efficiency of the large PLM-based news recommendation model while maintaining its performance. Multiple teacher models originating from different time steps of our post-training procedure are used to transfer comprehensive knowledge to the student in both its post-training and fine-tuning stages. Extensive experiments on two real-world datasets validate the effectiveness and efficiency of our method.
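To make the domain-specific post-training idea concrete, the following is a minimal sketch (not the authors' released code) of a contrastive title-body matching objective, assuming a Hugging Face PLM encoder, in-batch negatives, and an InfoNCE-style loss; the helper names `encode` and the `temperature` value are illustrative assumptions.

```python
# Sketch of title-body contrastive matching for domain-specific post-training.
# Assumptions: [CLS] pooling, in-batch negatives, InfoNCE loss.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
plm = AutoModel.from_pretrained("bert-base-uncased")

def encode(texts, max_len):
    """Encode a list of texts into [CLS] embeddings with the PLM."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_len, return_tensors="pt")
    return plm(**batch).last_hidden_state[:, 0]  # [CLS] token

def title_body_contrastive_loss(titles, bodies, temperature=0.05):
    """InfoNCE loss: each title should match its own body; the other
    bodies in the same batch serve as negatives."""
    t = F.normalize(encode(titles, max_len=32), dim=-1)
    b = F.normalize(encode(bodies, max_len=256), dim=-1)
    logits = t @ b.T / temperature           # (batch, batch) similarity matrix
    labels = torch.arange(len(titles))       # positive pairs lie on the diagonal
    return F.cross_entropy(logits, labels)
```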
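Similarly, a minimal sketch (an assumption, not the paper's exact formulation) of a multi-teacher distillation step: several teacher checkpoints taken from different post-training time steps supply soft targets, and the student optimizes a weighted combination of the distillation loss and the hard-label task loss. The uniform teacher weighting, `temperature`, and `alpha` here are illustrative choices.

```python
# Sketch of multi-teacher knowledge distillation with soft targets.
import torch
import torch.nn.functional as F

def multi_teacher_distill_loss(student_logits, teacher_logits_list,
                               labels, temperature=2.0, alpha=0.5):
    """Combine the hard-label task loss with the KL divergence to each
    teacher's softened output distribution (teachers weighted uniformly)."""
    task_loss = F.cross_entropy(student_logits, labels)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    distill_loss = 0.0
    for t_logits in teacher_logits_list:
        soft_teacher = F.softmax(t_logits / temperature, dim=-1)
        distill_loss += F.kl_div(soft_student, soft_teacher,
                                 reduction="batchmean") * temperature ** 2
    distill_loss /= len(teacher_logits_list)
    return alpha * task_loss + (1 - alpha) * distill_loss
```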