Learning to Rank (LTR) algorithms are usually evaluated using Information Retrieval metrics like Normalised Discounted Cumulative Gain (NDCG) or Mean Average Precision. As these metrics rely on sorting predicted items' scores (and thus, on items' ranks), their derivatives are either undefined or zero everywhere. This makes them unsuitable for gradient-based optimisation, which is the usual method of learning appropriate scoring functions. Commonly used LTR loss functions are only loosely related to the evaluation metrics, causing a mismatch between the optimisation objective and the evaluation criterion. In this paper, we address this mismatch by proposing NeuralNDCG, a novel differentiable approximation to NDCG. Since NDCG relies on the non-differentiable sorting operator, we obtain NeuralNDCG by relaxing that operator using NeuralSort, a differentiable approximation of sorting. As a result, we obtain a new ranking loss function which is an arbitrarily accurate approximation to the evaluation metric, thus closing the gap between the training and the evaluation of LTR models. We introduce two variants of the proposed loss function. Finally, the empirical evaluation shows that our proposed method outperforms previous work aimed at direct optimisation of NDCG and is competitive with the state-of-the-art methods.
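To make the construction concrete, below is a minimal PyTorch sketch of the idea described above: the sorting operator is replaced by the NeuralSort relaxation of Grover et al., which maps a score vector to a row-stochastic matrix approximating the permutation matrix that sorts scores in decreasing order, and that relaxed permutation is then used to compute a smooth NDCG surrogate. The function names (`neural_sort`, `neural_ndcg`) and the temperature parameter `tau` are illustrative choices, not the authors' code; in particular, the full NeuralNDCG method additionally applies Sinkhorn scaling to make the relaxed permutation matrix doubly stochastic, a step this sketch omits for brevity.

```python
import torch
import torch.nn.functional as F

def neural_sort(s, tau=1.0):
    """NeuralSort relaxation (Grover et al., 2019) of the sorting operator.

    s: (batch, n) predicted scores.
    Returns a (batch, n, n) row-stochastic matrix P_hat whose i-th row
    concentrates on the i-th largest score; as tau -> 0 it approaches the
    exact (hard) permutation matrix sorting s in decreasing order.
    """
    n = s.size(1)
    s = s.unsqueeze(-1)                                   # (batch, n, 1)
    A = (s - s.transpose(1, 2)).abs()                     # pairwise |s_i - s_j|
    B = A.sum(dim=-1)                                     # (batch, n): (A_s 1)_j
    scaling = n + 1 - 2 * torch.arange(1, n + 1, device=s.device, dtype=s.dtype)
    C = scaling.view(1, n, 1) * s.transpose(1, 2)         # C[b,i,j] = (n+1-2i) * s_j
    return F.softmax((C - B.unsqueeze(1)) / tau, dim=-1)

def neural_ndcg(scores, relevance, tau=1.0, k=None):
    """A hedged sketch of a NeuralNDCG-style loss (not the paper's exact variant).

    scores: (batch, n) model outputs; relevance: (batch, n) graded labels.
    Returns the negated smooth NDCG@k, suitable for gradient descent.
    """
    n = scores.size(1)
    k = k or n
    P_hat = neural_sort(scores, tau)                      # relaxed permutation
    gains = (2.0 ** relevance - 1.0).unsqueeze(-1)        # standard NDCG gains
    sorted_gains = (P_hat @ gains).squeeze(-1)            # approx. gains in predicted order
    discounts = 1.0 / torch.log2(
        torch.arange(2, n + 2, device=scores.device, dtype=scores.dtype))
    dcg = (sorted_gains[:, :k] * discounts[:k]).sum(dim=-1)
    # Ideal DCG uses an exact sort of the labels; no gradient is needed here.
    ideal_gains, _ = (2.0 ** relevance - 1.0).sort(descending=True, dim=-1)
    idcg = (ideal_gains[:, :k] * discounts[:k]).sum(dim=-1).clamp(min=1e-10)
    return -(dcg / idcg).mean()                           # negate: NDCG is maximised
```

Because `P_hat` is produced entirely by differentiable operations (absolute values, matrix products, softmax), the surrogate admits useful gradients with respect to the scores, and lowering `tau` trades gradient smoothness for a tighter approximation to the true metric.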