Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss applied to the outputs of a neural network when the softmax function is used. But what guarantees can we rely on when using cross-entropy as a surrogate loss? We present a theoretical analysis of a broad family of loss functions, comp-sum losses, that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error and other cross-entropy-like loss functions. We give the first $H$-consistency bounds for these loss functions. These are non-asymptotic guarantees that upper bound the zero-one loss estimation error in terms of the estimation error of a surrogate loss, for the specific hypothesis set $H$ used. We further show that our bounds are tight. These bounds depend on quantities called minimizability gaps, which only depend on the loss function and the hypothesis set. To make them more explicit, we give a specific analysis of these gaps for comp-sum losses. We also introduce a new family of loss functions, smooth adversarial comp-sum losses, derived from their comp-sum counterparts by adding a related smooth term. We show that these loss functions are beneficial in the adversarial setting by proving that they admit $H$-consistency bounds. This leads to new adversarial robustness algorithms that consist of minimizing a regularized smooth adversarial comp-sum loss. While our main purpose is a theoretical analysis, we also present an extensive empirical analysis comparing comp-sum losses. We further report the results of a series of experiments demonstrating that our adversarial robustness algorithms outperform the current state of the art, while also achieving superior non-adversarial accuracy.
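The opening claim, that cross-entropy coincides with a logistic-type loss when the softmax is applied to network outputs, can be checked numerically. Below is a minimal sketch (not from the paper): writing $u = \sum_{y' \neq y} e^{h(x,y') - h(x,y)}$, cross-entropy equals $\log(1+u)$ and a mean-absolute-error-style loss $1 - \mathrm{softmax}_y$ equals $u/(1+u)$. All function names and the specific $\Phi(u)$ forms here are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np

def cross_entropy(scores, y):
    # -log softmax_y(scores), computed in a numerically stable form.
    z = scores - scores.max()
    return np.log(np.exp(z).sum()) - z[y]

def logistic_form(scores, y):
    # Logistic-type rewriting: log(1 + sum_{y' != y} exp(scores[y'] - scores[y])).
    u = np.exp(np.delete(scores, y) - scores[y]).sum()
    return np.log1p(u)

def mae_form(scores, y):
    # One common MAE-style loss, 1 - softmax_y, rewritten as u / (1 + u),
    # i.e. the same u composed with a different outer function.
    u = np.exp(np.delete(scores, y) - scores[y]).sum()
    return u / (1.0 + u)

scores = np.array([2.0, -1.0, 0.5])  # hypothetical network outputs
softmax = np.exp(scores) / np.exp(scores).sum()
for y in range(len(scores)):
    assert np.isclose(cross_entropy(scores, y), logistic_form(scores, y))
    assert np.isclose(mae_form(scores, y), 1.0 - softmax[y])
```

The check holds for every label index, since $-\log \mathrm{softmax}_y(z) = \log \sum_j e^{z_j - z_y} = \log(1 + u)$ by dividing through by $e^{z_y}$.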