Bias and variance (B&V) decomposition is frequently used as a tool for analysing classification performance. However, the standard B&V terminology was originally defined for the regression setting, and its extension to classification has led to several different models and definitions in the literature. Although the relations between some of these models have previously been explored, their links to the standard terminology in terms of Bayesian statistics have not been established. In this paper, we aim to provide this missing link by employing the frameworks of Tumer & Ghosh (T&G) and James. By unifying the two approaches, we relate the classification B&V defined for the 0/1 loss to the standard B&V of the boundary distributions given for the squared error loss. The closed-form relationships derived in this study provide a deeper understanding of classification performance, and their use in predictor design and analysis is demonstrated in two case studies.