Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite being extensively studied, these attention mechanisms still suffer from two limitations. First, conventional attention mostly limits the attention field to a single user's behaviors, which is unsuitable in e-commerce, where users often hunt for new demands that are irrelevant to any historical behavior. Second, these attention mechanisms are usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle the two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that treats the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a prior, KFAtt falls back on global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both a benchmark dataset and a 10-billion-scale real production dataset, together with an online A/B test, show that KFAtt outperforms all compared state-of-the-art methods. KFAtt has been deployed in the ranking system of a leading e-commerce website, serving the main traffic of hundreds of millions of active users every day.
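To make the MAP view of attention concrete, the following is a minimal sketch and not the deployed implementation. It assumes a Gaussian prior with mean `prior_mean` and precision `prior_precision`, and treats each behavior as a noisy observation whose precision is `exp(query · key)`. The function names, the `group_ids` argument, and the `within_group_var` noise term are hypothetical and introduced only for illustration; the capping rule shown is one simple way to bound the influence of repeated behaviors.

```python
import numpy as np


def map_attention(query, keys, values, prior_mean, prior_precision=1.0):
    """Weighted pooling as the posterior mean under a Gaussian prior (assumed model).

    query: (d,), keys: (T, d), values: (T, dv), prior_mean: (dv,).
    With prior_precision = 0 this reduces to ordinary softmax attention.
    """
    # Assumption: each behavior's observation precision is exp(query . key).
    precisions = np.exp(keys @ query)
    numerator = prior_precision * prior_mean + precisions @ values
    denominator = prior_precision + precisions.sum()
    return numerator / denominator


def capped_map_attention(query, keys, values, group_ids, prior_mean,
                         prior_precision=1.0, within_group_var=1.0):
    """Variant that pools repeated behaviors per group before weighting.

    Assumption: a group's precision is 1 / (group_var + within_group_var / n),
    which stays below 1 / group_var no matter how often the group repeats.
    """
    numerator = prior_precision * prior_mean
    denominator = prior_precision
    for group in np.unique(group_ids):
        mask = group_ids == group
        count = mask.sum()
        group_value = values[mask].mean(axis=0)
        group_key = keys[mask].mean(axis=0)
        group_var = 1.0 / np.exp(group_key @ query)
        precision = 1.0 / (group_var + within_group_var / count)
        numerator = numerator + precision * group_value
        denominator = denominator + precision
    return numerator / denominator
```

When every behavior is irrelevant to the query, all precisions are close to zero and the output approaches `prior_mean`, which plays the role of the global statistics. In the capped variant, repeating one behavior a thousand times cannot push its weight above that of a single fully trusted observation.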