Both Efficiency and Effectiveness! A Large Scale Pre-ranking Framework in Search System

In search systems, the multi-stage cascade architecture is prevalent, typically consisting of sequential modules such as matching, pre-ranking, and ranking. It is generally acknowledged that the model used in the pre-ranking stage must strike a balance between effectiveness and efficiency. Thus, the most commonly employed architecture is the representation-focused model, which scores a document by the vector product of separately computed query and document embeddings. However, this architecture lacks effective interaction between the query and the document, which reduces the effectiveness of the search system. To address this issue, we present a novel pre-ranking framework called RankDFM. Our framework uses DeepFM as the backbone and employs a pairwise training paradigm to learn the ranking of videos for a given query. The ability of RankDFM to cross features yields significant improvements in both offline evaluation and online A/B testing. Furthermore, we introduce a learnable feature selection scheme that optimizes the model and reduces online inference time to a level equivalent to that of a tree model. RankDFM is currently deployed in the search system of a short-video app, serving hundreds of millions of users daily.
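The two ingredients named in the abstract, feature crossing and pairwise training, can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the factorization-machine (FM) part of a DeepFM-style scorer and omits the deep MLP branch and the feature selection scheme. The function names `fm_score` and `pairwise_logistic_loss`, the toy feature values, and the choice of a logistic pairwise loss are all assumptions made for illustration.

```python
import math


def fm_score(embeddings, linear_weights, bias=0.0):
    """FM score for one (query, video) pair (hypothetical helper).

    embeddings: one k-dimensional vector per active feature.
    linear_weights: one first-order weight per active feature.
    """
    k = len(embeddings[0])
    second_order = 0.0
    for d in range(k):
        # Sum of all pairwise products in dimension d, via the identity
        # sum_{i<j} a_i * a_j = 0.5 * ((sum a_i)^2 - sum a_i^2).
        total = sum(v[d] for v in embeddings)
        total_sq = sum(v[d] * v[d] for v in embeddings)
        second_order += 0.5 * (total * total - total_sq)
    return bias + sum(linear_weights) + second_order


def pairwise_logistic_loss(score_pos, score_neg):
    """log(1 + exp(-(score_pos - score_neg))), written to avoid overflow."""
    diff = score_pos - score_neg
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Toy example: two videos retrieved for the same query. Each video is
# described by two active features (say, a query feature and a video feature).
preferred = fm_score([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
other = fm_score([[1.0, 2.0], [0.5, -1.0]], [0.5, 0.1])

# The loss is small when the preferred video scores above the other one.
print(pairwise_logistic_loss(preferred, other))
```

Because the FM term multiplies query-side and video-side embeddings feature by feature, the score depends on their interaction rather than on a single dot product of two independently pooled vectors, which is the limitation of representation-focused models that the abstract points out.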


