E-commerce platforms usually display a mixed list of ads and organic items in the feed. One key problem is how to allocate the limited slots in the feed to maximize overall revenue while improving user experience, which requires a good model of user preference. Rather than modeling the influence of individual items on user behavior, the arrangement signal captures the influence of how items are arranged, and may lead to a better allocation strategy. However, most previous strategies fail to model this signal and therefore perform suboptimally. In addition, the percentage of ads exposed (PAE) is an important indicator in ads allocation: an excessive PAE hurts user experience, while an overly low PAE reduces platform revenue. Constraining the PAE within a certain range while still making personalized recommendations under that constraint is therefore a challenge. In this paper, we propose Cross Deep Q Network (Cross DQN), which extracts the crucial arrangement signal by crossing the embeddings of different items and modeling the crossed sequence with multi-channel attention. We also propose an auxiliary loss that imposes a batch-level constraint on PAE to address this challenge. Our model achieves higher revenue and better user experience than state-of-the-art baselines in offline experiments. Moreover, it demonstrates a significant improvement in an online A/B test and has been fully deployed on the Meituan feed to serve more than 300 million customers.
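To make the PAE quantity concrete, the following is a minimal sketch of how a batch-level PAE and a range penalty could be computed. It is not the auxiliary loss from the paper, whose exact form the abstract does not give: the function names `batch_pae` and `pae_penalty`, the 0/1 slot encoding, and the squared-distance penalty are all assumptions made for illustration.

```python
from typing import Sequence


def batch_pae(allocations: Sequence[Sequence[int]]) -> float:
    """Fraction of exposed slots that hold ads, pooled over a batch.

    Each allocation is a list of 0/1 flags, one per slot
    (1 = ad, 0 = organic item). This encoding is an assumption.
    """
    total_slots = sum(len(a) for a in allocations)
    if total_slots == 0:
        raise ValueError("batch contains no slots")
    total_ads = sum(sum(a) for a in allocations)
    return total_ads / total_slots


def pae_penalty(
    allocations: Sequence[Sequence[int]], lower: float, upper: float
) -> float:
    """Hypothetical penalty: zero when the batch PAE lies in [lower, upper],
    otherwise the squared distance to the nearest bound."""
    pae = batch_pae(allocations)
    if pae < lower:
        return (lower - pae) ** 2
    if pae > upper:
        return (pae - upper) ** 2
    return 0.0


# Two requests with five slots each: one ad in the first, two in the second.
batch = [[1, 0, 0, 0, 0], [0, 1, 0, 1, 0]]
pae = batch_pae(batch)                    # 3 ads out of 10 slots
inside = pae_penalty(batch, 0.25, 0.35)   # batch PAE is within the range
outside = pae_penalty(batch, 0.10, 0.20)  # batch PAE exceeds the upper bound
```

In this example the two requests have different per-request ad shares (one of five slots versus two of five), yet only the pooled batch value is penalized. That illustrates why a batch-level constraint leaves room for personalized allocation per request. A trainable version would need a differentiable surrogate over the model's outputs, which this sketch does not attempt.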