Recommender systems (RSs) have been the most important technology for growing business at Taobao, the largest online consumer-to-consumer (C2C) platform in China. The billion-scale data at Taobao poses three major challenges for its RS: scalability, sparsity, and cold start. In this paper, we present our technical solutions to these three challenges. The methods are based on the graph embedding framework. We first construct an item graph from users' behavior history. Each item is then represented as a vector using graph embedding. The item embeddings are used to compute pairwise similarities between all items, which are then used in the recommendation process. To alleviate the sparsity and cold start problems, side information is incorporated into the embedding framework. We propose two aggregation methods to integrate the embeddings of items and their corresponding side information. Offline experimental results show that methods incorporating side information are superior to those that do not. Further, we describe the platform on which the embedding methods are deployed and the workflow used to process the billion-scale data at Taobao. Using an online A/B test, we show that online click-through rates (CTRs) improve compared with the previous recommendation methods widely used at Taobao, further demonstrating the effectiveness and feasibility of our proposed methods in Taobao's live production environment.
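The pipeline summarized above (behavior history to item graph, item and side-information vectors to one aggregated embedding, embeddings to pairwise similarity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the two aggregation methods, so a plain average and a softmax-weighted average stand in for them, the embedding training step is omitted entirely, and the function names, toy sessions, and vectors are hypothetical.

```python
from collections import defaultdict
from math import exp, sqrt


def build_item_graph(sessions):
    """Count directed edges between items viewed consecutively in a session."""
    edges = defaultdict(int)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # ignore repeated views of the same item
                edges[(src, dst)] += 1
    return dict(edges)


def aggregate_mean(vectors):
    """Assumed aggregation 1: average the item vector and side-info vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def aggregate_weighted(vectors, logits):
    """Assumed aggregation 2: softmax-weighted average.

    `logits` holds one score per vector; in a real system these would be
    learned, here they are supplied by hand.
    """
    weights = [exp(a) for a in logits]
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, column)) / total
        for column in zip(*vectors)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data (hypothetical): two browsing sessions over three items.
sessions = [["A", "B", "C"], ["B", "C"]]
graph = build_item_graph(sessions)

# Placeholder vectors standing in for trained embeddings.
item_vec = [1.0, 0.0]
category_vec = [0.0, 1.0]  # one piece of side information, e.g. category
mean_vec = aggregate_mean([item_vec, category_vec])
weighted_vec = aggregate_weighted([item_vec, category_vec], [0.0, 0.0])
```

With equal logits the weighted average reduces to the plain average, which is a convenient sanity check. A new item with no behavior history still receives a usable vector from its side information alone, which is the intuition behind using side information against cold start.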