Listwise learning-to-rank methods form a powerful class of ranking algorithms that are widely adopted in applications such as information retrieval. These algorithms learn to rank a set of items by optimizing a loss that is a function of the entire set -- as a surrogate to a typically non-differentiable ranking metric. Despite their empirical success, existing listwise methods are based on heuristics and remain theoretically ill-understood. In particular, none of the empirically successful loss functions are related to ranking metrics. In this work, we propose a cross entropy-based learning-to-rank loss function that is theoretically sound, is a convex bound on NDCG -- a popular ranking metric -- and is consistent with NDCG under learning scenarios common in information retrieval. Furthermore, empirical evaluation of an implementation of the proposed method with gradient boosting machines on benchmark learning-to-rank datasets demonstrates the superiority of our proposed formulation over existing algorithms in terms of quality and robustness.
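The abstract does not spell out the loss itself. As a rough illustration only, the sketch below implements a generic listwise softmax cross-entropy loss (in the spirit of ListNet) with a gain-based target distribution -- one plausible shape for a cross entropy-based, NDCG-oriented objective. The function name softmax_cross_entropy_rank_loss and the choice of exponential gains 2^y - 1 are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax_cross_entropy_rank_loss(scores, labels):
    """Illustrative listwise cross-entropy loss (ListNet-style sketch).

    NOTE: a generic sketch, not the paper's exact loss. The target
    distribution is built from exponential gains (2^label - 1), a
    common choice in NDCG-oriented objectives; assumes at least one
    item in the list has a positive relevance label.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Target distribution over items, derived from relevance gains.
    gains = np.exp2(labels) - 1.0
    target = gains / gains.sum()

    # Model distribution: log-softmax over the predicted scores
    # (shift by the max score for numerical stability).
    shifted = scores - scores.max()
    log_softmax = shifted - np.log(np.exp(shifted).sum())

    # Cross entropy between the target and model distributions.
    return -(target * log_softmax).sum()

# Example: one query with three documents and graded relevance labels.
print(softmax_cross_entropy_rank_loss([2.1, 0.3, -1.0], [3, 1, 0]))
```

Using 2^y - 1 gains mirrors the numerator of DCG, which is one natural way a cross-entropy loss can be connected to NDCG-like metrics; the paper's actual construction and its bound on NDCG should be taken from the paper itself.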