E-commerce platforms usually display a mixed list of ads and organic items in the feed. One key problem is how to allocate the limited slots in the feed so as to maximize overall revenue while also improving user experience, which requires a good model of user preference. Instead of modeling the influence of individual items on user behavior, the arrangement signal models the influence of how items are arranged, and may lead to a better allocation strategy. However, most previous strategies fail to model this signal and therefore yield suboptimal performance. To this end, we propose Cross Deep Q Network (Cross DQN), which extracts the arrangement signal by crossing the embeddings of different items and processing the crossed sequence in the feed. Our model achieves higher revenue and better user experience than state-of-the-art baselines in offline experiments. Moreover, our model demonstrates a significant improvement in an online A/B test and has been fully deployed on the Meituan feed to serve more than 300 million customers.
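To make the idea of an arrangement-dependent allocation concrete, the following is a minimal toy sketch, not the Cross DQN architecture itself. It assumes a hypothetical setup in which each candidate allocation is a slot mask (1 = ad, 0 = organic item), the ad and organic lists are interleaved ("crossed") according to that mask, and the resulting sequence is scored as a whole. The names `cross_sequence`, `candidate_masks`, `toy_score`, and `best_mask`, and the adjacency penalty that stands in for a learned Q-value, are all illustrative assumptions.

```python
from itertools import product


def cross_sequence(ads, organics, mask):
    """Interleave ads and organic items by slot mask (1 = ad, 0 = organic).

    Each list keeps its own internal order; the mask only decides which
    list fills each slot.
    """
    ad_iter, org_iter = iter(ads), iter(organics)
    return [next(ad_iter) if is_ad else next(org_iter) for is_ad in mask]


def candidate_masks(num_slots, max_ads):
    """Enumerate every slot mask that uses at most `max_ads` ad slots."""
    return [m for m in product((0, 1), repeat=num_slots) if sum(m) <= max_ads]


def toy_score(sequence, adjacency_penalty=1.0):
    """Hypothetical stand-in for a learned Q-value (an assumption).

    Sums per-item values, then subtracts a penalty for every pair of
    adjacent ads, so the score depends on the arrangement and not only
    on which items are shown.
    """
    total = sum(value for _, value in sequence)
    adjacent_ads = sum(
        1
        for left, right in zip(sequence, sequence[1:])
        if left[0] == "ad" and right[0] == "ad"
    )
    return total - adjacency_penalty * adjacent_ads


def best_mask(ads, organics, num_slots, max_ads):
    """Pick the mask whose crossed sequence has the highest toy score."""
    return max(
        candidate_masks(num_slots, max_ads),
        key=lambda mask: toy_score(cross_sequence(ads, organics, mask)),
    )


# Illustrative data: (kind, value) pairs with made-up values.
ads = [("ad", 2.0), ("ad", 1.5)]
organics = [("organic", 1.0), ("organic", 0.8), ("organic", 0.5)]
chosen = best_mask(ads, organics, num_slots=3, max_ads=2)
```

In this toy setting, two allocations that show the same items can receive different scores purely because of their order, which is the kind of effect the arrangement signal is meant to capture. A learned model would replace `toy_score` with a network operating on item embeddings.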