NDCG, short for Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-$K$ variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-$K$ NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies that use an initial warm-up stage and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms are proposed to optimize NDCG with a provable convergence guarantee.
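For reference, the standard definition of NDCG at cutoff $K$ is sketched below; this is background not stated in the abstract, and the symbols $\pi$, $y_i$, and $Z_K$ are generic notation rather than the paper's own:
$$
\mathrm{DCG}@K(\pi) = \sum_{i=1}^{K} \frac{2^{y_{\pi(i)}} - 1}{\log_2(i+1)}, \qquad
\mathrm{NDCG}@K(\pi) = \frac{\mathrm{DCG}@K(\pi)}{Z_K},
$$
where $\pi(i)$ denotes the item placed at rank $i$, $y_{\pi(i)}$ is its relevance label, and $Z_K$ is the DCG$@K$ of the ideal ranking, so that $\mathrm{NDCG}@K \in [0,1]$.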