In this paper, we present Arrow, a configurable hardware accelerator architecture for edge machine learning inference that implements a subset of the RISC-V v0.9 vector ISA extension. Our experimental results show that, when implemented in a Xilinx XC7A200T-1SBG484C FPGA, an Arrow co-processor executes a suite of vector and matrix benchmarks fundamental to machine learning inference 2x to 78x faster than a scalar RISC processor while consuming 20% to 99% less energy.