Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered when training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty functions as part of the learning objective, alongside a standard classification loss, with a hyper-parameter controlling the relative contribution of each term. Nevertheless, these methods share two major drawbacks: 1) the scalar balancing weight is the same for all classes, hindering the ability to address different intrinsic difficulties or imbalance among classes; and 2) the balancing weight is usually fixed without an adaptive strategy, which may prevent reaching the best compromise between accuracy and calibration, and requires a hyper-parameter search for each application. We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks, which learns class-wise multipliers during training, yielding a powerful alternative to common label smoothing penalties. Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization, but we introduce several modifications to tailor it for large-scale, class-adaptive training. Comprehensive evaluation and multiple comparisons on a variety of benchmarks, including standard and long-tailed image classification, semantic segmentation, and text classification, demonstrate the superiority of the proposed method. The code is available at https://github.com/by-liu/CALS.
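To make the high-level description above concrete, the following is a minimal PyTorch sketch of one plausible instantiation of a class-adaptive augmented Lagrangian scheme: a margin constraint on the logits, a PHR-style (Powell-Hestenes-Rockafellar) penalty, and an outer per-class multiplier update. The margin value, the specific constraint and penalty forms, and all names (`phr_penalty`, `cals_loss`, `update_multipliers`, `margin`, `rho`) are illustrative assumptions, not the paper's exact formulation; the authors' implementation is in the repository linked above.

```python
import torch
import torch.nn.functional as F


def phr_penalty(z, lam, rho):
    # PHR penalty: a smooth augmented-Lagrangian term for an
    # inequality constraint z <= 0, with multiplier lam and penalty
    # parameter rho. (Illustrative choice of penalty function.)
    return torch.where(
        lam + rho * z >= 0,
        lam * z + 0.5 * rho * z ** 2,
        -lam ** 2 / (2 * rho),
    )


def cals_loss(logits, targets, lam, rho, margin=10.0):
    # Inner step: cross-entropy plus class-wise penalties on the
    # (assumed) logit-margin constraints max_k(l_k) - l_j - margin <= 0,
    # weighted by one learned multiplier lam[j] per class.
    ce = F.cross_entropy(logits, targets)
    d = logits.max(dim=1, keepdim=True).values - logits - margin  # (B, C)
    penalty = phr_penalty(d, lam.unsqueeze(0), rho).mean()
    return ce + penalty


@torch.no_grad()
def update_multipliers(lam, logits, rho, margin=10.0):
    # Outer ALM step on held-out data: lam_j <- max(0, lam_j + rho * d_j),
    # with the constraint values averaged per class over the batch.
    d = logits.max(dim=1, keepdim=True).values - logits - margin
    return torch.clamp(lam + rho * d.mean(dim=0), min=0.0)


# Usage sketch: lam = torch.ones(num_classes); minimize cals_loss on the
# training set, then periodically call update_multipliers on validation
# logits so each class's penalty weight adapts to how violated its
# constraint remains.
```

The key point this sketch illustrates is the abstract's contrast with fixed-weight penalties: instead of one hand-tuned scalar shared by all classes, each class carries its own multiplier, and the alternation between minimizing the penalized loss and updating the multipliers follows the standard augmented Lagrangian recipe.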