Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks

This work explores the search for heterogeneous approximate multiplier configurations for neural networks that deliver high accuracy at low energy consumption. We discuss the validity of additive Gaussian noise (AGN), added to accurate neural network computations, as a surrogate model for the behavioral simulation of approximate multipliers. Because the solution space spanned by the AGN model is continuous and differentiable, it can serve as a heuristic that yields meaningful estimates of layer robustness without combinatorial optimization. Instead, the amount of noise injected into the accurate computations is learned during network training using backpropagation. To bridge the gap between the noise domain and the hardware domain, we present a probabilistic model of the multiplier error. The model estimates the standard deviation of the approximate multiplier error and thereby connects solutions in the AGN space to actual hardware instances. Our experiments show that combining heterogeneous approximation with neural network retraining reduces the energy consumption of multiplications by 70% to 79% for different ResNet variants on the CIFAR-10 dataset, with a Top-1 accuracy loss below one percentage point. For the more complex Tiny ImageNet task, our VGG16 model achieves a 53% reduction in energy consumption with a drop in Top-5 accuracy of 0.5 percentage points. We further demonstrate that our error model predicts the parameters of an approximate multiplier under the commonly used AGN model with high accuracy. Our software implementation is available at https://github.com/etrommer/agn-approx.
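As an illustration of what "estimating the standard deviation of the approximate multiplier error" can mean in practice, here is a minimal sketch. It is not the paper's probabilistic model and does not use the agn-approx code. It assumes a hypothetical truncating multiplier (`truncated_multiply`), unsigned 8-bit operands drawn uniformly, and independent errors across the multiplications that feed one accumulator. All function names are invented for this example.

```python
import itertools
import math


def truncated_multiply(a: int, b: int, dropped_bits: int = 4) -> int:
    """Hypothetical approximate multiplier: clear the low bits of both
    operands before multiplying (stands in for a real hardware design)."""
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) * (b & mask)


def error_mean_std(approx_mul, bits: int = 8):
    """Mean and standard deviation of (approximate - exact) over all
    unsigned operand pairs. Assumption: operands are uniformly distributed."""
    errors = [
        approx_mul(a, b) - a * b
        for a, b in itertools.product(range(1 << bits), repeat=2)
    ]
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, math.sqrt(variance)


def accumulated_noise_std(single_std: float, fan_in: int) -> float:
    """Std of the sum of `fan_in` multiplier errors.
    Assumption: the individual errors are independent and identically
    distributed, so their variances add."""
    return single_std * math.sqrt(fan_in)


if __name__ == "__main__":
    mean, std = error_mean_std(truncated_multiply)
    # Noise level a neuron with 9 inputs (e.g. a 3x3 kernel, one channel)
    # would see at its accumulator under the i.i.d. assumption.
    layer_std = accumulated_noise_std(std, fan_in=9)
    print(f"per-multiplication error: mean={mean:.1f}, std={std:.1f}")
    print(f"accumulated std for fan-in 9: {layer_std:.1f}")
```

In a setting like the one the abstract describes, a standard deviation obtained this way could be compared with the noise level a layer tolerates in order to choose a multiplier for that layer. The uniform operand distribution is a simplification, since real weight and activation distributions would change the estimate.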
