Existing work in counterfactual Learning to Rank (LTR) has focused on optimizing feature-based models that predict the optimal ranking based on document features. LTR methods based on bandit algorithms often optimize tabular models that memorize the optimal ranking per query. Each type of model has its own advantages and disadvantages. Feature-based models provide very robust performance across many queries, including previously unseen ones; however, the available features often limit the rankings the model can predict. In contrast, tabular models can converge on any possible ranking through memorization. However, memorization is extremely prone to noise, which makes tabular models reliable only when large numbers of user interactions are available. Can we develop a robust counterfactual LTR method that pursues memorization-based optimization whenever it is safe to do so? We introduce the Generalization and Specialization (GENSPEC) algorithm, a robust feature-based counterfactual LTR method that pursues per-query memorization when it is safe to do so. GENSPEC optimizes a single feature-based model for generalization (robust performance across all queries) and many tabular models for specialization (each optimized for high performance on a single query). GENSPEC uses novel relative high-confidence bounds to choose which model to deploy per query. By doing so, GENSPEC combines the high performance of successfully specialized tabular models with the robustness of a generalized feature-based model. Our results show that GENSPEC leads to optimal performance on queries with sufficient click data, while behaving robustly on queries with little or noisy data.
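The per-query choice between a specialized tabular model and the general feature-based model can be illustrated with a minimal sketch. It is not the relative high-confidence bound proposed for GENSPEC, which the abstract does not specify. As a stand-in it assumes a plain Hoeffding-style lower bound on the mean of per-interaction estimates of the performance difference between the two models. The names `lower_confidence_bound`, `choose_model`, `diffs`, `value_range`, and `delta` are hypothetical and introduced only for illustration.

```python
import math
from typing import Sequence


def lower_confidence_bound(
    diffs: Sequence[float], value_range: float, delta: float = 0.05
) -> float:
    """Hoeffding-style lower bound on the true mean of `diffs`.

    Assumption: the samples are i.i.d. and each lies in an interval of
    width `value_range`. This is a stand-in bound, not the one used by
    GENSPEC.
    """
    n = len(diffs)
    if n == 0:
        # No evidence at all: the bound is vacuous.
        return float("-inf")
    mean = sum(diffs) / n
    # With probability at least 1 - delta, the true mean exceeds mean - margin.
    margin = value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return mean - margin


def choose_model(
    diffs: Sequence[float], value_range: float, delta: float = 0.05
) -> str:
    """Pick the model to deploy for a single query.

    `diffs` holds estimates of (specialized performance - general
    performance). The specialized model is deployed only when it is
    better with high confidence; otherwise the general model is kept.
    """
    if lower_confidence_bound(diffs, value_range, delta) > 0.0:
        return "specialized"
    return "general"


if __name__ == "__main__":
    # Many consistent observations: the bound is tight enough to specialize.
    print(choose_model([0.5] * 1000, value_range=2.0))
    # Few observations: the bound is too loose, so fall back to the general model.
    print(choose_model([0.5] * 3, value_range=2.0))
```

Under these assumptions, a query with little interaction data keeps the general model even when the observed difference favors the specialized one, which mirrors the safety behavior described in the abstract.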