This paper explores connections between margin-based loss functions and consistency in binary classification and regression applications. It is shown that a large class of margin-based loss functions for binary classification/regression results in estimating scores equivalent to log-likelihood scores weighted by an even function. A simple characterization of conformable (consistent) loss functions is given, which allows for straightforward comparison of different losses, including exponential loss, logistic loss, and others. The characterization is used to construct a new Huber-type loss function for the logistic model. A simple relation between the margin and standardized logistic regression residuals is derived, demonstrating that all margin-based losses can be viewed as functions of squared standardized logistic regression residuals. The relation provides new, straightforward interpretations of exponential and logistic loss, and aids in understanding why exponential loss is sensitive to outliers. In particular, it is shown that minimizing empirical exponential loss is equivalent to minimizing the sum of squared standardized logistic regression residuals. The relation also provides new insight into the AdaBoost algorithm.
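The equivalence between exponential loss and squared standardized residuals can be checked numerically under one common parametrization. This sketch assumes the margin is m = ỹη with ỹ ∈ {-1, +1} and η the linear predictor (log-odds); note that other scalings of the score (e.g. the AdaBoost half-logit scale) would change the exponent by a constant factor:

```python
import math

def sigmoid(eta):
    """Logistic model probability p = P(y = 1) for linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def exponential_loss(margin):
    """Exponential loss as a function of the margin m = y_signed * eta."""
    return math.exp(-margin)

def squared_standardized_residual(y, eta):
    """Squared Pearson (standardized) residual of the logistic model,
    with y in {0, 1} and eta the linear predictor."""
    p = sigmoid(eta)
    r = (y - p) / math.sqrt(p * (1.0 - p))
    return r * r

# Numerical check: exp(-y_signed * eta) equals the squared standardized
# residual for both label encodings and a range of predictor values.
for y, y_signed in [(0, -1), (1, +1)]:
    for eta in [-2.0, -0.5, 0.0, 1.3, 3.0]:
        assert abs(exponential_loss(y_signed * eta)
                   - squared_standardized_residual(y, eta)) < 1e-12
```

Under this parametrization, summing the exponential loss over a sample is therefore the same as summing the squared standardized logistic regression residuals, which is one way to see why a single large residual (an outlier) can dominate the fit.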