Industrial recommender systems have been growing increasingly complex: they may involve \emph{diverse domains} such as e-commerce products and user-generated content, and can comprise \emph{a myriad of tasks} such as retrieval, ranking, explanation generation, and even AI-assisted content production. The mainstream approach so far has been to develop individual algorithms for each domain and each task. In this paper, we explore the possibility of developing a unified foundation model to support \emph{open-ended domains and tasks} in an industrial recommender system, which may reduce the demand for downstream data and minimize the carbon footprint by avoiding training a separate model from scratch for every task. Deriving such a unified foundation model is challenging due to (i) the potentially unlimited set of downstream domains and tasks, and (ii) real-world systems' emphasis on computational efficiency. We thus build our foundation upon M6, an existing large-scale industrial pretrained language model similar to GPT-3 and T5, and leverage M6's pretrained capabilities for sample-efficient downstream adaptation by representing user behavior data as plain text and converting each task to either language understanding or language generation. To cope with a tight hardware budget, we propose an improved version of prompt tuning that outperforms fine-tuning with only 1\% task-specific parameters, and employ techniques such as late interaction, early exiting, parameter sharing, and pruning to further reduce the inference time and the model size. We demonstrate the foundation model's versatility on a wide range of tasks, including retrieval, ranking, zero-shot recommendation, explanation generation, personalized content creation, and conversational recommendation, and successfully deploy it on both cloud servers and mobile devices.
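To make the text-based conversion concrete, the sketch below shows one way user behavior data might be serialized into plain text for a language model to consume. The field names and prompt template are illustrative assumptions, not M6's actual input format.

```python
# Minimal sketch of converting structured user-behavior data into plain text,
# so that a pretrained language model can consume it directly. The field names
# and the prompt template are illustrative assumptions, not M6's actual format.

def verbalize_user(profile: dict, clicks: list) -> str:
    """Serialize a user profile and click history into a natural-language prompt."""
    parts = [f"A {profile['age']}-year-old {profile['gender']} user"]
    parts.append("recently clicked: " + "; ".join(
        f"{item['title']} (category: {item['category']})" for item in clicks))
    return " ".join(parts) + "."

# Scoring a candidate item can then be cast as language understanding:
# feed "user text [SEP] item text" to the model and read out a relevance score,
# while tasks like explanation generation become ordinary text generation.
prompt = verbalize_user(
    {"age": 25, "gender": "female"},
    [{"title": "wireless earbuds", "category": "electronics"},
     {"title": "running shoes", "category": "sports"}],
)
print(prompt)
```

Because every task reduces to reading or producing text, the same frozen backbone can serve open-ended domains without a bespoke feature pipeline per task.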
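To illustrate the parameter-efficiency claim, the following is a generic soft-prompt-tuning sketch in PyTorch: the backbone is frozen and only a small set of prompt vectors (plus a small task head) is trained. The tiny stand-in encoder replaces the frozen M6 backbone, and the paper's improved prompt-tuning variant is not reproduced here; this is only the standard technique it builds on.

```python
import torch
import torch.nn as nn

# Generic soft-prompt tuning: freeze the backbone, train only a handful of
# prompt embeddings (the "~1% task-specific parameters") and a small head.
# The tiny TransformerEncoder is a stand-in for the frozen M6 backbone.

class PromptTunedModel(nn.Module):
    def __init__(self, d_model=64, n_prompt=8, vocab=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.parameters():           # freeze embedding + backbone
            p.requires_grad_(False)
        # Only the soft-prompt vectors and the head are task-specific/trainable.
        self.prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)
        self.head = nn.Linear(d_model, 1)

    def forward(self, token_ids):             # token_ids: (batch, seq_len)
        x = self.embed(token_ids)
        p = self.prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        h = self.backbone(torch.cat([p, x], dim=1))
        return self.head(h[:, 0])              # score from the first prompt slot

model = PromptTunedModel()
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.2%}")
```

Keeping the backbone frozen means one copy of the large model can be shared across tasks, with each task contributing only its own prompt vectors, which is what makes the approach attractive under a tight hardware budget.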