Machine learning methods such as neural networks are remarkably successful and popular across a variety of applications; however, they come at substantial computational cost, accompanied by high energy demands. Hardware capabilities, in contrast, are limited, and there is evidence that technology scaling is stagnating; hence, new approaches are required to meet the performance demands of increasingly complex model architectures. As an unsafe optimization, noisy computations are more energy-efficient and, given a fixed power budget, also more time-efficient. However, any kind of unsafe optimization requires countermeasures to ensure functionally correct results. This work considers noisy computations in an abstract form and aims to understand the implications of such noise on the accuracy of neural-network-based classifiers as an exemplary workload. We propose a methodology called "Walking Noise", which assesses the robustness of individual layers of deep architectures by means of a so-called "midpoint noise level" metric. We then investigate the implications of additive and multiplicative noise for different classification tasks and model architectures, with and without batch normalization. While noisy training significantly increases robustness for both noise types, we observe a clear trend toward increasing weights, and thus the signal-to-noise ratio, under additive noise injection. For the multiplicative case, we find that some networks, given suitably simple tasks, automatically learn an internal binary representation and hence become extremely robust. Overall, this work proposes a method to measure layer-specific robustness and shares first insights into how networks learn to compensate for injected noise, thereby contributing to the understanding of robustness against noisy computations.
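To make the two noise models concrete, the following is a minimal NumPy sketch of additive and multiplicative Gaussian noise injected into a linear layer's output. It is an illustration of the abstract noise model only; the paper's actual injection points, distributions, and layer coverage may differ, and the function name `noisy_linear` is our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_linear(x, W, b, sigma, mode="additive"):
    """Forward pass of a linear layer with injected Gaussian noise.

    mode="additive":       y = Wx + b + eps,   eps ~ N(0, sigma^2)
    mode="multiplicative": y = (Wx + b) * eps, eps ~ N(1, sigma^2)

    Illustrative sketch of the abstract noise model, not the
    paper's implementation.
    """
    y = x @ W.T + b  # clean linear response
    if mode == "additive":
        return y + rng.normal(0.0, sigma, size=y.shape)
    if mode == "multiplicative":
        return y * rng.normal(1.0, sigma, size=y.shape)
    raise ValueError(f"unknown mode: {mode}")
```

Note how the two models differ in what scaling the weights achieves: for additive noise, larger weights enlarge the signal while the noise magnitude stays fixed, improving the signal-to-noise ratio; for multiplicative noise, signal and noise scale together, so the network must find other compensation strategies, such as the binary internal representations observed in the paper.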