Model compression via quantization and sparsity enhancement has gained immense interest for enabling the deployment of deep neural networks (DNNs) in resource-constrained edge environments. Although these techniques have shown promising results in reducing the energy, latency, and memory requirements of DNNs, their performance in non-ideal real-world settings (such as in the presence of hardware faults) is yet to be completely understood. In this paper, we investigate the impact of bit-flip and stuck-at faults on activation-sparse quantized DNNs (QDNNs). We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults. For instance, activation-sparse QDNNs exhibit up to 17.32% lower accuracy than standard QDNNs. We also establish that one of the major causes of the degraded accuracy is sharper minima in the loss landscape of activation-sparse QDNNs, which makes them more sensitive to the perturbations in weight values caused by faults. Based on this observation, we propose to mitigate the impact of faults by employing a sharpness-aware quantization (SAQ) training scheme. Activation-sparse and standard QDNNs trained with SAQ achieve up to 36.71% and 24.76% higher inference accuracy, respectively, compared to their conventionally trained equivalents. Moreover, we show that SAQ-trained activation-sparse QDNNs attain better accuracy in faulty settings than conventionally trained standard QDNNs. Thus, the proposed technique can be instrumental in achieving sparsity-related energy/latency benefits without compromising fault tolerance.
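To make the fault model concrete, the sketch below illustrates one common way to emulate random bit-flip faults in the int8 weights of a quantized layer. It is not the authors' evaluation code; the fault rate, the uniform choice of bit position, and the function name inject_bit_flips are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's implementation): flip one
# random bit in a randomly chosen subset of int8 weights to emulate
# bit-flip faults, then the perturbed weights can be loaded back into the
# QDNN to measure the accuracy drop.
import numpy as np

def inject_bit_flips(weights_int8: np.ndarray, fault_rate: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Return a copy of the int8 weights with random single-bit flips."""
    faulty = weights_int8.copy()
    flat = faulty.reshape(-1)                     # view into the copy
    n_faults = int(fault_rate * flat.size)
    idx = rng.choice(flat.size, size=n_faults, replace=False)
    bit_positions = rng.integers(0, 8, size=n_faults)   # bits 0..7 of int8
    masks = (1 << bit_positions).astype(np.uint8)
    # XOR through a uint8 view so the bit flip is well defined for int8.
    flat_u8 = flat.view(np.uint8)
    flat_u8[idx] ^= masks
    return faulty

# Example: a toy 64x64 int8 weight matrix with a 1% bit-flip fault rate.
rng = np.random.default_rng(0)
w = rng.integers(-128, 128, size=(64, 64), dtype=np.int8)
w_faulty = inject_bit_flips(w, fault_rate=0.01, rng=rng)
print(np.count_nonzero(w != w_faulty), "weights perturbed")
```

Stuck-at faults can be emulated analogously by forcing the selected bits to a fixed 0 or 1 instead of XOR-ing them.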