Recent studies on Click-Through Rate (CTR) prediction have reached new levels by modeling longer user behavior sequences. Among them, two-stage methods stand out as the state-of-the-art (SOTA) solution for industrial applications. These methods first train a retrieval model to truncate the long behavior sequence, and then use the truncated sequences to train a CTR model. However, because the retrieval model and the CTR model are trained separately, the subsequences retrieved for the CTR model are inaccurate, which degrades final performance. In this paper, we propose an end-to-end paradigm for modeling long behavior sequences that achieves superior performance along with remarkable cost-efficiency compared to existing models. Our contribution is threefold. First, we propose a hashing-based efficient target attention (TA) network named ETA-Net, which enables end-to-end user behavior retrieval based on low-cost bit-wise operations. The proposed ETA-Net can reduce the complexity of standard TA by orders of magnitude for sequential data modeling. Second, we propose a general system architecture as one viable solution for deploying ETA-Net on industrial systems. In particular, ETA-Net has been deployed on the recommender system of Taobao, where it brought a 1.8% lift on CTR and a 3.1% lift on Gross Merchandise Value (GMV) compared to the SOTA two-stage methods. Third, we conduct extensive experiments on both offline datasets and online A/B tests. The results verify that the proposed model considerably outperforms existing CTR models in terms of both CTR prediction performance and online cost-efficiency. ETA-Net now serves the main traffic of Taobao, delivering services to hundreds of millions of users across billions of items every day.
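To make the idea of hashing-based retrieval with bit-wise operations concrete, the following is a minimal sketch and not the ETA-Net implementation. The abstract does not specify the hash function, so the sketch assumes a SimHash-style signature (sign bits of random Gaussian projections), Hamming distance computed with XOR as the retrieval score, and plain scaled dot-product attention over the retrieved top-k behaviors. All function names, the number of bits, and the value of k are hypothetical choices made for illustration.

```python
import math
import random


def make_hyperplanes(dim, num_bits, seed=0):
    """Random Gaussian hyperplanes for a SimHash-style signature (assumed hash)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vec, hyperplanes):
    """Pack the sign of each random projection into one integer bit code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two bit codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def retrieve_top_k(target_code, behavior_codes, k):
    """Indices of the k behaviors whose codes are closest to the target code."""
    order = sorted(
        range(len(behavior_codes)),
        key=lambda i: hamming(target_code, behavior_codes[i]),
    )
    return order[:k]


def target_attention(target, behaviors):
    """Scaled dot-product attention of the target item over the given behaviors."""
    scale = math.sqrt(len(target))
    scores = [sum(t * b for t, b in zip(target, beh)) / scale for beh in behaviors]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * beh[j] for w, beh in zip(weights, behaviors))
        for j in range(len(target))
    ]


def hashed_interest(target, behaviors, hyperplanes, k):
    """Retrieve top-k behaviors by Hamming distance, then attend over them only."""
    target_code = simhash(target, hyperplanes)
    codes = [simhash(b, hyperplanes) for b in behaviors]
    top_idx = retrieve_top_k(target_code, codes, k)
    return top_idx, target_attention(target, [behaviors[i] for i in top_idx])
```

In this sketch the full-length sequence is only touched by the cheap XOR comparison, while the more expensive attention runs on k items. Because the retrieval step uses the same embeddings as the attention step, there is no separately trained retrieval model, which is the property the abstract contrasts with two-stage methods.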