From Circuits to SoC Processors: Arithmetic Approximation Techniques & Embedded Computing Methodologies for DSP Acceleration

The computing industry is being forced to find alternative design approaches and computing platforms that sustain gains in power efficiency while still delivering sufficient performance. Among the solutions examined, Approximate Computing, Hardware Acceleration, and Heterogeneous Computing have gained significant momentum. In this dissertation, we introduce design solutions and methodologies, built on top of these computing paradigms, for the development of energy-efficient DSP and AI accelerators. In particular, we adopt the paradigm of Approximate Computing and apply new approximation techniques to the design of arithmetic circuits. The proposed arithmetic approximation techniques involve bit-level optimizations, inexact operand encodings, and skipping of computations, and they are applied to both fixed- and floating-point arithmetic. We also conduct an extensive exploration of combinations of the approximation techniques and propose a low-overhead scheme for seamlessly adjusting the approximation degree of our circuits at runtime. These arithmetic approximation techniques are then combined with hardware design techniques to implement approximate ASIC- and FPGA-based DSP and AI accelerators. Moreover, we propose methodologies for the efficient mapping of DSP/AI kernels onto two distinctive classes of embedded devices, namely space-grade FPGAs and heterogeneous VPUs. On the one hand, we cope with the decreased flexibility of space-grade technology and the technical challenges that arise in new FPGA tools. On the other hand, we unlock the full potential of heterogeneity by exploiting all the diverse processors and memories. Based on our methodology, we efficiently map computer vision algorithms onto NanoXplore's radiation-hardened FPGAs and accelerate DSP and CNN kernels on Intel's Myriad VPUs.
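To make the idea of a runtime-adjustable approximation degree concrete, the following is a minimal software sketch, not the dissertation's circuits. It assumes one generic bit-level approximation (zeroing the `k` least significant bits of each operand before multiplying); the function names `approx_multiply` and `relative_error` and the parameter `k` are hypothetical and chosen only for illustration.

```python
def approx_multiply(a: int, b: int, k: int) -> int:
    """Approximate a * b for non-negative integers.

    Assumption (illustrative only): the approximation zeroes the k least
    significant bits of each operand, so k acts as a runtime-selectable
    approximation degree. k = 0 gives the exact product.
    """
    if a < 0 or b < 0 or k < 0:
        raise ValueError("a, b, and k must be non-negative")
    mask = ~((1 << k) - 1)  # clears the k low-order bits
    return (a & mask) * (b & mask)


def relative_error(a: int, b: int, k: int) -> float:
    """Relative error of the approximate product versus the exact one."""
    exact = a * b
    if exact == 0:
        return 0.0
    return abs(exact - approx_multiply(a, b, k)) / exact


if __name__ == "__main__":
    # Sweep the approximation degree for one operand pair.
    for k in range(0, 6):
        print(k, approx_multiply(200, 100, k), round(relative_error(200, 100, k), 4))
```

In this sketch a larger `k` discards more partial-product information, which in hardware would correspond to a smaller multiplier array at the cost of a larger error; the actual accuracy and energy trade-offs depend on the specific techniques, which the abstract does not detail.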
