This work proposes a novel Energy-Aware Network Operator Search (ENOS) approach to address the energy-accuracy trade-offs of a deep neural network (DNN) accelerator. In recent years, novel inference operators have been proposed to improve the computational efficiency of a DNN. Alongside these operators, their corresponding novel computing modes have also been explored. However, simplifying DNN operators invariably comes at the cost of lower accuracy, especially on complex processing tasks. Our proposed ENOS framework allows an optimal layer-wise integration of inference operators and computing modes to achieve the desired balance of energy and accuracy. The search in ENOS is formulated as a continuous optimization problem, solvable using typical gradient descent methods, and therefore scales to larger DNNs with minimal increase in training cost. We characterize ENOS under two settings. In the first setting, for digital accelerators, we discuss ENOS on multiply-accumulate (MAC) cores that can be reconfigured to different operators. ENOS training methods with single-level and bi-level optimization objectives are discussed and compared. We also discuss a sequential operator assignment strategy in ENOS that learns the assignment for only one layer per training step, enabling greater flexibility in converging towards the optimal operator allocations. Furthermore, following Bayesian principles, a sampling-based variational mode of ENOS is also presented. ENOS is characterized on the popular DNNs ShuffleNet and SqueezeNet on CIFAR10 and CIFAR100.
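To make the continuous formulation concrete, below is a minimal PyTorch sketch of a searchable layer in the spirit the abstract describes: candidate operators are mixed by a softmax over trainable architecture logits, so the discrete operator choice relaxes into a problem solvable by ordinary gradient descent. The class and attribute names (`ENOSLayer`, `alpha`, `candidate_ops`) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ENOSLayer(nn.Module):
    """Hypothetical sketch of an ENOS-style searchable layer.

    Each layer holds several candidate inference operators (e.g., a
    standard multiply-accumulate convolution and cheaper approximate
    operators). A trainable logit vector `alpha` mixes their outputs
    via softmax, relaxing the discrete operator assignment into a
    continuous, differentiable optimization.
    """

    def __init__(self, candidate_ops):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        # One architecture logit per candidate operator.
        self.alpha = nn.Parameter(torch.zeros(len(candidate_ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        # Weighted sum of all candidate operators' outputs.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def hardened_op(self):
        # After the search converges, the layer keeps the single
        # operator with the largest architecture weight.
        return self.ops[int(torch.argmax(self.alpha))]
```

In a single-level setting, `alpha` and the operator weights can be updated jointly on the training loss; a bi-level variant would instead update `alpha` on a held-out validation loss while the operator weights are trained on the training set.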
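The energy-accuracy trade-off and the sampling-based variational mode could plausibly enter the objective as sketched below. This is an assumption about the form of the loss, not the paper's stated formulation: the expected energy is taken as each layer's architecture weights dotted with a per-operator energy-cost vector, and the variational mode is illustrated with a Gumbel-softmax relaxation (my substitution; the paper's exact Bayesian sampling scheme may differ).

```python
import torch
import torch.nn.functional as F

def enos_objective(logits, targets, layers, op_energy, lam=0.1):
    """Sketch of an energy-regularized single-level objective.

    `layers` are ENOSLayer instances from the sketch above, and
    `op_energy` is a 1-D tensor of per-operator energy costs (both
    hypothetical names). The scalar `lam` sets the energy-accuracy
    trade-off that gradient descent then balances.
    """
    task = F.cross_entropy(logits, targets)
    energy = sum(F.softmax(layer.alpha, dim=0) @ op_energy
                 for layer in layers)
    return task + lam * energy

def sample_operator_weights(alpha, tau=1.0):
    """Sampling-based variant: draw near-one-hot operator weights
    from the architecture logits with the Gumbel-softmax relaxation
    (an assumption), keeping the sample differentiable in `alpha`."""
    return F.gumbel_softmax(alpha, tau=tau, hard=False)
```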