We present a detailed study of estimation errors in terms of surrogate loss estimation errors. We refer to such guarantees as $\mathscr{H}$-consistency estimation error bounds, since they account for the hypothesis set $\mathscr{H}$ adopted. These guarantees are significantly stronger than $\mathscr{H}$-calibration or $\mathscr{H}$-consistency. They are also more informative than similar excess error bounds derived in the literature, which correspond to the special case where $\mathscr{H}$ is the family of all measurable functions. We prove general theorems providing such guarantees, in both the distribution-dependent and distribution-independent settings. We show that our bounds are tight, modulo a convexity assumption. We also show that previous excess error bounds can be recovered as special cases of our general results. We then present a series of explicit bounds in the case of the zero-one loss, with multiple choices of the surrogate loss, both for the family of linear functions and for one-hidden-layer neural networks. We further prove more favorable distribution-dependent guarantees in that case. We also present a series of explicit bounds in the case of the adversarial loss, with surrogate losses based on the supremum of the $\rho$-margin, hinge, or sigmoid losses, for the same two general hypothesis sets. Here too, we prove several enhancements of these guarantees under natural distributional assumptions. Finally, we report the results of simulations illustrating our bounds and their tightness.
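To make the form of these guarantees concrete, a schematic $\mathscr{H}$-consistency estimation error bound (the notation below is illustrative and not the exact statement proved in the paper) relates the target and surrogate estimation errors as
\[
\mathcal{E}_{\ell}(h) - \inf_{h' \in \mathscr{H}} \mathcal{E}_{\ell}(h')
\;\le\;
\Gamma\Big(\mathcal{E}_{\Phi}(h) - \inf_{h' \in \mathscr{H}} \mathcal{E}_{\Phi}(h')\Big),
\qquad \text{for all } h \in \mathscr{H},
\]
where $\ell$ is the target loss (for instance the zero-one or adversarial loss), $\Phi$ a surrogate loss, $\mathcal{E}_{\ell}(h)$ the expected $\ell$-loss of $h$, and $\Gamma$ a non-decreasing function; additive terms depending on $\mathscr{H}$ and the distribution may also appear. Taking $\mathscr{H}$ to be the family of all measurable functions recovers the standard excess error bounds mentioned above.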