Low-precision formats have recently driven major breakthroughs in neural network (NN) training and inference by reducing the memory footprint of NN models and improving the energy efficiency of the underlying hardware architectures. Narrow integer data types have been extensively investigated for NN inference and have successfully been pushed to the extreme of ternary and binary representations. In contrast, most training-oriented platforms use at least 16-bit floating-point (FP) formats. Lower-precision data types such as 8-bit FP formats and mixed-precision techniques have only recently been explored in hardware implementations. We present MiniFloat-NN, a RISC-V instruction set architecture extension for low-precision NN training that supports two 8-bit and two 16-bit FP formats together with expanding operations. The extension includes sum-of-dot-product instructions that accumulate the result in a larger format, and three-term additions in two variations: expanding and non-expanding. We implement an ExSdotp unit to efficiently support both instruction types in hardware. The fused nature of the ExSdotp module prevents the precision losses caused by the non-associativity of two consecutive FP additions, while saving around 30% of the area and critical path compared to a cascade of two expanding fused multiply-add units. We replicate the ExSdotp module in a SIMD wrapper and integrate it into an open-source floating-point unit, which, coupled to an open-source RISC-V core, lays the foundation for future scalable architectures targeting low-precision and mixed-precision NN training. A cluster containing eight extended cores sharing a scratchpad memory, implemented in 12 nm FinFET technology, achieves up to 575 GFLOPS/W when computing FP8-to-FP16 GEMMs at 0.8 V, 1.26 GHz.
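The precision argument for a fused sum-of-dot-product can be illustrated with a small software model. The sketch below is not the ExSdotp hardware or its instruction semantics: it assumes the operation has the form a*b + c*d + e, uses an FP16-to-FP32 pairing (11 and 24 significand bits), models rounding as round-to-nearest-even with an unbounded exponent range (no overflow or subnormals), and all function names are hypothetical. It compares a cascade of two expanding fused multiply-adds, which rounds twice, against a single fused evaluation, which rounds once.

```python
from fractions import Fraction

FP32_BITS = 24  # significand bits of the wider accumulation format (assumed FP32)


def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round x to the nearest value with p significant bits, ties to even.

    Simplified model: the exponent range is unbounded, so overflow and
    subnormals are ignored.
    """
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    # round() on a Fraction rounds half to even and returns an int.
    return sign * round(mag / ulp) * ulp


def cascaded_exfma(a, b, c, d, e):
    """Two expanding FMAs in sequence: each one rounds its result."""
    t = round_to_precision(a * b + e, FP32_BITS)
    return round_to_precision(c * d + t, FP32_BITS)


def fused_exsdotp(a, b, c, d, e):
    """Hypothetical fused model: exact a*b + c*d + e, rounded once."""
    return round_to_precision(a * b + c * d + e, FP32_BITS)


# Inputs chosen so the two products cancel; all are exactly representable
# in FP16 (a, b, c, d) and FP32 (e).
a, b = Fraction(4096), Fraction(4096)
c, d = Fraction(-4096), Fraction(4096)
e = Fraction(1)

cascaded = cascaded_exfma(a, b, c, d, e)
fused = fused_exsdotp(a, b, c, d, e)
print("cascaded:", cascaded, "fused:", fused)
```

With these inputs the intermediate sum 2^24 + 1 does not fit in 24 significand bits, so the cascade loses the accumulator value before the second product cancels the first, whereas the single-rounding model retains it. The inputs are deliberately adversarial and say nothing about typical error magnitudes in NN training.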