Edge computing requires high-performance, energy-efficient embedded systems. Fixed-function or custom accelerators, such as FFT or FIR filter engines, are very efficient at implementing a particular functionality under a given set of constraints. However, they are inflexible when facing application-wide optimizations or functionality upgrades. Conversely, programmable cores offer higher flexibility, but often with a penalty in area, performance, and, above all, energy consumption. In this paper, we propose VWR2A, an architecture that integrates high computational density with low-power memory structures (i.e., very-wide registers and scratchpad memories). VWR2A narrows the energy gap with respect to an FFT accelerator while delivering similar or better performance on FFT kernels. Moreover, VWR2A's flexibility allows it to accelerate multiple kernels, resulting in significant energy savings at the application level.