Showing items that do not match the intent of a search query degrades the customer experience in e-commerce. These mismatches result from counterfactual biases of ranking algorithms toward noisy behavioral signals, such as clicks and purchases, in the search logs. Mitigating the problem requires a large labeled dataset, which is expensive and time-consuming to obtain. In this paper, we develop a deep, end-to-end model that learns to classify mismatches effectively and to generate hard mismatched examples that improve the classifier. We train the model end-to-end by introducing a latent variable into the cross-entropy loss that alternates between real and generated samples. This not only makes the classifier more robust but also boosts overall ranking performance. Relative to baselines, our model improves F-score by over 26% and area under the PR curve by over 17%. On live search traffic, our model yields significant improvements in multiple countries.
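The abstract describes a cross-entropy loss with a latent variable that alternates between real and generated samples but gives no formula. The sketch below shows one way such a switch could look, and it is not the paper's actual formulation. The function names, the Bernoulli sampling of `z`, the `prob_real` parameter, and the choice to label every generated pair as a mismatch are all assumptions made for illustration.

```python
import math
import random


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy of a predicted mismatch probability p against a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mixed_loss(p_real, y_real, p_gen, z):
    """Loss for one training step, gated by a binary latent variable z.

    z = 1 scores the real, labeled query-item pair.
    z = 0 scores the generated pair, which is treated as a mismatch
    (label 1) by assumption.
    """
    real_term = binary_cross_entropy(p_real, y_real)
    gen_term = binary_cross_entropy(p_gen, 1)
    return z * real_term + (1 - z) * gen_term


def sample_z(rng, prob_real=0.5):
    """Draw the latent switch; prob_real is a hypothetical mixing rate."""
    return 1 if rng.random() < prob_real else 0


if __name__ == "__main__":
    rng = random.Random(0)
    z = sample_z(rng)
    print(z, mixed_loss(p_real=0.9, y_real=1, p_gen=0.2, z=z))
```

With `z = 1` the loss reduces to the ordinary supervised term on real data. With `z = 0` the classifier is penalized for assigning a low mismatch probability to a generated example, which is the sense in which hard generated examples could make it more robust.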