We introduce a novel bias-variance decomposition for a range of strictly convex margin losses, including the logistic loss (minimized by the classic LogitBoost algorithm), as well as the squared margin loss and canonical boosting loss. Furthermore, we show that, for all strictly convex margin losses, the expected risk decomposes into the risk of a "central" model and a term quantifying variation in the functional margin with respect to variations in the training data. These decompositions provide a diagnostic tool for practitioners to understand model overfitting/underfitting, and have implications for additive ensemble models -- for example, when our bias-variance decomposition holds, there is a corresponding "ambiguity" decomposition, which can be used to quantify model diversity.
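As a point of reference for the kind of decomposition described above, the classical squared-loss case can be checked numerically: the expected risk over models trained on varied data splits exactly into the risk of the average ("central") model plus the variance of the predictions around it. This is a minimal sketch of that classical identity only — for other strictly convex margin losses the central model is not the arithmetic mean, and the simulated margins below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
y = 1.0  # target functional margin (hypothetical)
# Margins produced by models fit on 1000 resampled training sets (simulated)
f = rng.normal(loc=0.8, scale=0.3, size=1000)

f_bar = f.mean()                         # central model: arithmetic mean (squared loss only)
expected_risk = np.mean((y - f) ** 2)    # average risk of the individual models
central_risk = (y - f_bar) ** 2          # risk of the central model ("bias" term)
variance = np.mean((f - f_bar) ** 2)     # variation of margins w.r.t. training data

# The squared-loss bias-variance identity holds exactly:
assert np.isclose(expected_risk, central_risk + variance)
```

For the squared loss this is an algebraic identity of the empirical mean, so the assertion holds to machine precision regardless of the simulated values.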