Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered when training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty functions as part of the learning objective, alongside a standard classification loss, with a hyper-parameter controlling the relative contribution of each term. Nevertheless, these methods share two major drawbacks: 1) the scalar balancing weight is the same for all classes, hindering the ability to address differences in intrinsic difficulty or imbalance among classes; and 2) the balancing weight is usually fixed without an adaptive strategy, which may prevent reaching the best compromise between accuracy and calibration, and requires a hyper-parameter search for each application. We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks, which learns class-wise multipliers during training, yielding a powerful alternative to common label smoothing penalties. Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization, but we introduce several modifications to tailor it for large-scale, class-adaptive training. Comprehensive evaluation and multiple comparisons on a variety of benchmarks, including standard and long-tailed image classification, semantic segmentation, and text classification, demonstrate the superiority of the proposed method. The code is available at https://github.com/by-liu/CALS.
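To make the core idea concrete, the following minimal PyTorch sketch illustrates the general shape of such a scheme: a classification loss plus a per-class penalty weighted by learnable multipliers, with an Augmented-Lagrangian-style multiplier update between epochs. This is our illustration, not the released CALS code; the particular constraint `d` (mean predicted probability per class relative to uniform) and the fixed penalty parameter `rho` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch (not the authors' implementation): cross-entropy plus a
# per-class penalty weighted by class-wise multipliers lambda_k, updated with
# a standard Augmented Lagrangian (PHR) rule between epochs.

num_classes = 10
rho = 1.0                              # penalty parameter (assumed fixed here)
lambdas = torch.ones(num_classes)      # one multiplier per class, lambda_k >= 0

def loss_fn(logits, targets, lambdas, rho):
    ce = F.cross_entropy(logits, targets)
    # Illustrative constraint d_k: mean predicted probability of class k
    # should not exceed the uniform level 1/K (a stand-in for the paper's
    # calibration constraints).
    probs = logits.softmax(dim=1).mean(dim=0)   # shape: (num_classes,)
    d = probs - 1.0 / num_classes               # per-class constraint violation
    # PHR (Powell-Hestenes-Rockafellar) penalty, a common ALM choice:
    penalty = torch.where(
        lambdas + rho * d >= 0,
        lambdas * d + 0.5 * rho * d ** 2,
        -lambdas ** 2 / (2 * rho),
    ).sum()
    return ce + penalty, d

@torch.no_grad()
def update_multipliers(lambdas, d, rho):
    # Outer ALM step: lambda_k <- max(0, lambda_k + rho * d_k).
    # Classes that violate their constraint more get a larger multiplier,
    # which is what makes the balancing weight class-adaptive.
    return (lambdas + rho * d).clamp_min(0.0)
```

The key departure from a fixed penalty weight is visible in `update_multipliers`: each class carries its own multiplier that grows or shrinks with that class's constraint violation, rather than a single hand-tuned scalar shared across all classes.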