Recent work has revealed an essential paradigm in designing loss functions: distinguishing individual losses from aggregate losses. The individual loss measures the quality of the model on a single sample, while the aggregate loss combines the individual losses/scores over all training samples. Both share a common procedure that aggregates a set of individual values into a single numerical value. The ranking order reflects the most fundamental relation among individual values in designing losses. In addition, decomposability, in which a loss can be decomposed into an ensemble of individual terms, is a significant property for organizing losses/scores. This survey provides a systematic and comprehensive review of rank-based decomposable losses in machine learning. Specifically, we provide a new taxonomy of loss functions that follows the perspectives of aggregate loss and individual loss. We identify the aggregators used to form such losses, which are examples of set functions. We organize the rank-based decomposable losses into eight categories. Following these categories, we review the literature on rank-based aggregate losses and rank-based individual losses. We describe general formulas for these losses and connect them with existing research topics. We also suggest future research directions spanning unexplored, remaining, and emerging issues in rank-based decomposable losses.
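To make the idea of a rank-based aggregator concrete, the following minimal sketch contrasts a plain average with an average over the k largest individual losses. The average top-k form is chosen here only as an illustration of an aggregate that depends on the ranking of individual values; the function names and the sample values are hypothetical and are not taken from the survey.

```python
def average_loss(losses):
    """Plain mean of the individual losses (ignores their rank order)."""
    return sum(losses) / len(losses)


def average_top_k_loss(losses, k):
    """Mean of the k largest individual losses (a rank-based aggregate)."""
    if not 1 <= k <= len(losses):
        raise ValueError("k must be between 1 and len(losses)")
    # Rank the individual losses, largest first, then keep the top k.
    ranked = sorted(losses, reverse=True)
    return sum(ranked[:k]) / k


# Made-up individual losses for four training samples (illustration only).
individual_losses = [0.5, 2.0, 0.25, 1.0]
```

Under these assumptions, `k = 1` recovers the maximum individual loss and `k = len(losses)` recovers the plain average, so the rank-based form interpolates between the two. Both aggregates map a set of individual values to a single number, and the result is a sum of individual terms selected by rank.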