Recommendation systems are a core feature of social media companies, with uses that include recommending both organic and promoted content. Many modern recommendation systems are split into multiple stages, candidate generation and heavy ranking, to balance computational cost against recommendation quality. In this paper, we focus on the candidate generation phase of a large-scale ads recommendation problem and present a machine-learning-first heterogeneous re-architecture of this stage, which we term TwERC. We show that a system that combines a real-time light ranker with sourcing strategies capable of capturing additional information provides validated gains. We present two such strategies. The first uses a notion of similarity in the interaction graph, while the second caches previous scores from the ranking stage. The graph-based strategy achieves a 4.08% revenue gain, and the rankscore-based strategy achieves a 1.38% gain. The biases of these two strategies complement both the light ranker and one another. Finally, we describe a set of metrics that we believe are valuable for understanding the complex product trade-offs inherent in industrial candidate generation systems.
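To make the heterogeneous design described in the abstract concrete, the following is a minimal Python sketch of candidate generation that merges three sources: a light ranker, a graph-similarity source, and a cache of earlier heavy-ranker scores. It is not the TwERC implementation. Every name here (`top_k`, `graph_candidates`, `merge_candidates`), the use of Jaccard overlap as the similarity measure, and the round-robin merge policy are assumptions chosen for illustration; the abstract does not specify any of them.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return up to k ad ids by descending score (ties broken by id)."""
    return sorted(scores, key=lambda ad: (-scores[ad], ad))[:k]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two user sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def graph_candidates(
    engaged_ads: Set[str], ad_engagers: Dict[str, Set[str]], k: int
) -> List[str]:
    """Hypothetical graph source: score each ad the user has not engaged
    with by its best overlap with an ad the user did engage with."""
    scores: Dict[str, float] = {}
    for ad, users in ad_engagers.items():
        if ad in engaged_ads:
            continue
        sim = max(
            (jaccard(users, ad_engagers.get(seen, set())) for seen in engaged_ads),
            default=0.0,
        )
        if sim > 0:
            scores[ad] = sim
    return top_k(scores, k)


def merge_candidates(sources: List[List[str]], budget: int) -> List[str]:
    """Round-robin merge with de-duplication, capped at `budget` ads.
    The merge policy is an assumption, not taken from the paper."""
    merged: List[str] = []
    seen: Set[str] = set()
    depth = max((len(s) for s in sources), default=0)
    for rank in range(depth):
        for source in sources:
            if rank < len(source) and source[rank] not in seen:
                seen.add(source[rank])
                merged.append(source[rank])
                if len(merged) == budget:
                    return merged
    return merged


# Toy data, invented for illustration only.
light_scores = {"a1": 0.9, "a4": 0.5, "a5": 0.7}   # real-time light ranker
cached_scores = {"a6": 0.8, "a1": 0.4}             # cached heavy-ranker scores
ad_engagers = {"a1": {"u1", "u2"}, "a2": {"u1", "u2", "u3"}, "a3": {"u9"}}

light = top_k(light_scores, 2)
graph = graph_candidates({"a1"}, ad_engagers, 2)
cached = top_k(cached_scores, 1)
candidates = merge_candidates([light, graph, cached], budget=3)
```

In this toy run each source contributes one ad that the others would not have surfaced, which is the sense in which sources with different biases can complement one another. A production system would replace the round-robin merge with whatever blending policy the product trade-offs call for.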