Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

Inductive transfer learning has had a big impact on the computer vision and NLP domains but has not been used in the area of recommender systems. Although a large body of research generates recommendations by modeling user-item interaction sequences, few studies attempt to represent and transfer these models to serve downstream tasks where only limited data exists. In this paper, we delve into the task of effectively learning a single user representation that can be applied to a variety of tasks, from cross-domain recommendation to user profile prediction. Fine-tuning a large pre-trained network and adapting it to downstream tasks is an effective way to solve such tasks. However, fine-tuning is parameter-inefficient, since an entire model needs to be re-trained for every new task. To overcome this issue, we develop a parameter-efficient transfer learning architecture, termed PeterRec, which can be configured on the fly for various downstream tasks. Specifically, PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small but as expressive as learning the entire network. We perform extensive ablation experiments to show the effectiveness of the learned user representation in five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters. Code and datasets are available at https://github.com/fajieyuan/sigir2020_peterrec.
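The idea of leaving pre-trained parameters unaltered while injecting small trainable networks can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the residual bottleneck form of the injected network, the layer sizes, and all names (`w_frozen`, `w_down`, `w_up`, `adapted_layer`). A single dense layer stands in for one layer of the pre-trained sequence model.

```python
import random

def linear(x, w):
    """Multiply vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def relu(v):
    return [max(0.0, a) for a in v]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def num_params(*matrices):
    return sum(len(m) * len(m[0]) for m in matrices)

rng = random.Random(0)
hidden, bottleneck = 64, 4  # hypothetical sizes, chosen only for the demo

# Stand-in for one pre-trained layer. It is shared by all tasks and never updated.
w_frozen = rand_matrix(hidden, hidden, rng)

# Small injected network (assumed form: down-project, ReLU, up-project, residual).
# Only these two matrices would be trained for a new downstream task.
w_down = rand_matrix(hidden, bottleneck, rng)
w_up = [[0.0] * hidden for _ in range(bottleneck)]  # zero init: starts as identity

def frozen_layer(x):
    return relu(linear(x, w_frozen))

def adapter(h):
    z = relu(linear(h, w_down))
    delta = linear(z, w_up)
    return [a + b for a, b in zip(h, delta)]  # residual connection

def adapted_layer(x):
    return adapter(frozen_layer(x))

x = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
frozen_count = num_params(w_frozen)
adapter_count = num_params(w_down, w_up)
print(f"frozen parameters: {frozen_count}, per-task parameters: {adapter_count}")
```

In this sketch each new task stores only `w_down` and `w_up` (512 values) next to the shared 4096-value frozen layer, and the zero-initialized up-projection means the adapted layer reproduces the pre-trained output exactly before any task-specific training.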
