Ensembles are a straightforward, remarkably effective method for improving the accuracy, calibration, and robustness of models on classification tasks; yet, the reasons that underlie their success remain an active area of research. We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers. Introducing a dual reparameterization of the bias-variance tradeoff, we first derive generalized laws of total expectation and variance for nonsymmetric losses typical of classification tasks. Comparing conditional and bootstrap bias/variance estimates, we then show that conditional estimates necessarily incur an irreducible error. Next, we show that ensembling in dual space reduces the variance and leaves the bias unchanged, whereas standard ensembling can arbitrarily affect the bias. Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected bias reduction. We conclude with an empirical analysis of recent deep learning methods that ensemble over hyperparameters, revealing that these techniques indeed favor bias reduction. This suggests that, contrary to classical wisdom, targeting bias reduction may be a promising direction for classifier ensembles.
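For concreteness, the following is a minimal sketch of the Bregman-divergence bias-variance decomposition in the style of Pfau (2013) that the abstract builds on; the notation here (the generator $F$, the divergence $D_F$, the dual mean $\tilde{y}$) is our assumed shorthand, not taken verbatim from the paper. With the prediction in the second argument,
\[
  D_F(y, \hat{y}) \;=\; F(y) - F(\hat{y}) - \langle \nabla F(\hat{y}),\, y - \hat{y} \rangle ,
\]
and the dual reparameterization averages predictions in the dual coordinates $\nabla F$ before mapping back:
\[
  \tilde{y} \;=\; (\nabla F)^{-1}\!\bigl( \mathbb{E}_\theta[\, \nabla F(\hat{y}_\theta) \,] \bigr),
  \qquad
  \mathbb{E}_\theta\bigl[ D_F(y, \hat{y}_\theta) \bigr]
  \;=\; \underbrace{D_F(y, \tilde{y})}_{\text{bias-type term}}
  \;+\; \underbrace{\mathbb{E}_\theta\bigl[ D_F(\tilde{y}, \hat{y}_\theta) \bigr]}_{\text{variance-type term}} .
\]
For squared loss $\nabla F$ is linear and $\tilde{y}$ is the ordinary mean; for log loss the dual coordinates are log-probabilities, so the dual mean is a normalized geometric mean of the member predictions.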
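As a toy illustration of the distinction the abstract draws (the setup and function names below are ours, not the paper's): for log loss, "standard ensembling" is an arithmetic mean of member probabilities, while "ensembling in dual space" averages in log-probability coordinates and renormalizes, i.e. takes a normalized geometric mean.

```python
import numpy as np

def standard_ensemble(probs):
    """Standard ensembling: arithmetic mean of member probabilities.

    probs: array of shape (n_members, n_classes), rows summing to 1.
    """
    return probs.mean(axis=0)

def dual_ensemble(probs, eps=1e-12):
    """Dual-space ensembling for log loss: average in log-probability
    (dual) coordinates, then renormalize -- a normalized geometric mean.
    """
    log_mean = np.log(probs + eps).mean(axis=0)
    p = np.exp(log_mean - log_mean.max())  # shift for numerical stability
    return p / p.sum()

# Toy check: three classifiers, three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.3, 0.1],
])
print("standard:", standard_ensemble(probs))
print("dual    :", dual_ensemble(probs))
```

Under the decomposition sketched above, the dual ensemble only shrinks the variance-type term and leaves the bias-type term unchanged, whereas the arithmetic ensemble can move the bias in either direction, which is the effect the abstract reports observing empirically.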