NDCG, or Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimizing NDCG and its top-$K$ variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-$K$ NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the resulting non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size rather than the total number of items. To improve effectiveness for deep learning, we further propose practical strategies that use an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first work to propose stochastic algorithms that optimize NDCG with a provable convergence guarantee. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
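For reference, the NDCG@$k$ metric that the paper optimizes can be computed as follows. This is a minimal sketch of the standard exponential-gain definition, not code from the paper; the helper names are illustrative.

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions of a ranked list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    # Positions 1..k are discounted by log2(position + 1).
    discounts = np.log2(np.arange(2, rel.size + 2))
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(scores, relevances, k):
    """NDCG@k: DCG of items ranked by predicted score, normalized by the ideal DCG."""
    scores = np.asarray(scores, dtype=float)
    relevances = np.asarray(relevances, dtype=float)
    # Rank items by descending predicted score.
    ranked_rel = relevances[np.argsort(-scores)]
    # The ideal ranking sorts by true relevance.
    ideal = dcg_at_k(np.sort(relevances)[::-1], k)
    return dcg_at_k(ranked_rel, k) / ideal if ideal > 0 else 0.0
```

A model whose scores reproduce the ideal ordering attains NDCG@$k$ = 1; any misordering of relevant items yields a value strictly below 1, which is what the proposed surrogates are designed to push upward.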