Nonparametric estimation of a statistic in general, and of the error rate of a classification rule in particular, from a single available dataset through resampling is mathematically well founded in the literature, which uses several versions of the bootstrap and the influence function. This article first provides a concise review of this literature to establish the theoretical framework that we then use to construct, in a single coherent framework, nonparametric estimators of the AUC (a two-sample statistic) rather than the error rate (a one-sample statistic). In addition, the smoothness of some of these estimators is investigated and explained. Our experiments show that the behavior of the designed AUC estimators confirms the findings of the literature on the behavior of error rate estimators in many respects, including the weak correlation between the bootstrap-based estimators and the true conditional AUC, and the comparable accuracy of the different versions of the bootstrap estimators in terms of the RMS, with a slight superiority of the .632+ bootstrap estimator.
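To make the distinction between a one-sample and a two-sample statistic concrete, the following minimal sketch computes the empirical AUC as a Mann-Whitney statistic over all (negative, positive) score pairs and resamples each class separately. It is an illustration only, under stated assumptions: the function names `auc_mann_whitney` and `bootstrap_auc`, the toy scores, and the choice of 200 replicates are hypothetical, and the sketch resamples the scores of an already trained classifier. It does not implement the leave-one-out, .632, or .632+ bootstrap estimators discussed in the article, which require retraining the classifier on each bootstrap sample.

```python
import random


def auc_mann_whitney(neg_scores, pos_scores):
    """Empirical AUC: share of (negative, positive) pairs ranked correctly.

    A tie between a negative and a positive score counts as 1/2.
    """
    total = 0.0
    for x in neg_scores:
        for y in pos_scores:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(neg_scores) * len(pos_scores))


def bootstrap_auc(neg_scores, pos_scores, n_boot=200, seed=0):
    """Plain two-sample bootstrap of the AUC of fixed classifier scores.

    Each class is resampled with replacement on its own, which is what
    makes the AUC a two-sample statistic. Returns the mean and the
    standard deviation of the bootstrap replicates.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        neg_star = rng.choices(neg_scores, k=len(neg_scores))
        pos_star = rng.choices(pos_scores, k=len(pos_scores))
        replicates.append(auc_mann_whitney(neg_star, pos_star))
    mean = sum(replicates) / n_boot
    var = sum((r - mean) ** 2 for r in replicates) / (n_boot - 1)
    return mean, var ** 0.5


# Hypothetical toy scores, for illustration only.
neg = [0.1, 0.4, 0.35, 0.8]
pos = [0.9, 0.65, 0.7, 0.3]
auc_hat = auc_mann_whitney(neg, pos)
boot_mean, boot_sd = bootstrap_auc(neg, pos)
```

Here `auc_hat` is the apparent AUC of the toy scores, while `boot_mean` and `boot_sd` summarize its variability under resampling. Resampling the two classes independently keeps the class sizes fixed in every replicate, which is one reasonable design choice among several.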