Energy-efficient deep neural network (DNN) accelerators are prone to non-idealities that degrade DNN performance at inference time. To mitigate such degradation, existing methods typically add perturbations to the DNN weights during training to simulate inference on noisy hardware. However, this often requires knowledge about the target hardware and leads to a trade-off between DNN performance and robustness, decreasing the former to increase the latter. In this work, we show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy hardware at inference time without relying on any assumptions about the target hardware. In particular, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier minimization (SAMSON). Our approach outperforms existing sharpness-aware training methods both in terms of model generalization performance in noiseless regimes and robustness in noisy settings, as measured on several architectures and datasets.
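To make the idea concrete, below is a minimal sketch of the perturbation step described above. The abstract only states that the worst-case perturbation of a weight is conditioned on its magnitude and on the range of the weight distribution; the function name `samson_perturbation` and the specific scaling `T = |w| / (max(w) - min(w))` are illustrative assumptions, not the paper's exact formulation. The sketch follows the general SAM/ASAM form, where the ascent step is `rho * T^2 g / ||T g||` and plain SAM is recovered when `T` is the identity.

```python
import torch

def samson_perturbation(params, rho=0.05, eps=1e-12):
    """Sketch of a sharpness-aware ascent perturbation (hypothetical).

    Assumes a per-weight scale T = |w| / range(w), so the perturbation
    depends on both the weight magnitude and the range of the weight
    distribution, as the abstract describes. Call after loss.backward().
    """
    scales, scaled_grads = [], []
    for w in params:
        # Assumed scaling operator T: magnitude normalized by the
        # span of this tensor's weight distribution.
        t = w.abs() / (w.max() - w.min() + eps)
        scales.append(t)
        scaled_grads.append(t * w.grad)  # T g, used for the global norm
    total_norm = torch.sqrt(sum((sg * sg).sum() for sg in scaled_grads))
    # epsilon = rho * T^2 g / ||T g||; with T = identity this is plain SAM.
    return [rho * t * sg / (total_norm + eps)
            for t, sg in zip(scales, scaled_grads)]
```

In a training loop, one would add the returned perturbations to the weights, recompute the loss and gradients at the perturbed point, restore the original weights, and then apply the optimizer step, as in standard two-step SAM training.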