Boosting methods often achieve excellent classification accuracy, but their performance can degrade notably in the presence of label noise. Existing robust boosting methods provide theoretical robustness guarantees for certain types of label noise and can exhibit only moderate performance degradation. However, previous theoretical results do not account for realistic types of noise or finite training sizes, and existing robust methods can provide unsatisfactory accuracy even without noise. This paper presents methods for robust minimax boosting (RMBoost) that minimize worst-case error probabilities and are robust to general types of label noise. In addition, we provide finite-sample performance guarantees for RMBoost with respect to the error obtained without noise and with respect to the best possible error (Bayes risk). The experimental results corroborate that RMBoost is not only resilient to label noise but can also provide strong classification accuracy.