Modern computer architectures support low-precision arithmetic, which presents opportunities for mixed-precision algorithms that achieve high computational throughput and reduce energy consumption. As a growing number of scientific computations leverage specialized hardware accelerators, the risk that accumulated rounding errors compromise the reliability of models increases. This shift towards hardware-optimized, low-precision computation highlights the importance of rounding error analysis to ensure that performance gains do not come at the expense of accuracy, especially in high-stakes scientific applications. In this work, we conduct rounding error analysis on widely used operations such as fused multiply-add (FMA), mixed-precision FMA (MPFMA), and those performed by NVIDIA Tensor Cores. We present deterministic and probabilistic approaches to quantifying the accumulated rounding errors. Numerical experiments with random data are presented for the multiply-accumulate (MAC) operation and for matrix-matrix multiplication using Tensor Cores. We show that, for matrix-matrix multiplication, the probabilistic bounds are nearly an order of magnitude tighter than the deterministic ones.
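The gap between the two kinds of bound can be illustrated with a minimal sketch, which is not the paper's own analysis or experiment. It assumes the classical worst-case constant gamma_n = n*u / (1 - n*u) and a probabilistic constant of the form exp(lam*sqrt(n)*u + n*u^2/(1 - u)) - 1 in the style of Higham and Mary, where lam is a free confidence parameter. The bounds derived in this work for FMA, MPFMA, and Tensor Cores may differ. The function names (`gamma_det`, `gamma_prob`, `mac_fp16`, `compare`) are hypothetical, and the MAC is emulated on the CPU in IEEE binary16 with NumPy rather than run on Tensor Core hardware.

```python
import math
import numpy as np

U_FP16 = 2.0 ** -11  # unit roundoff of IEEE binary16 with round-to-nearest


def gamma_det(n, u=U_FP16):
    """Worst-case (deterministic) constant gamma_n = n*u / (1 - n*u)."""
    if n * u >= 1.0:
        raise ValueError("deterministic bound requires n*u < 1")
    return n * u / (1.0 - n * u)


def gamma_prob(n, lam=1.0, u=U_FP16):
    """Probabilistic constant (assumed form, see the lead-in); grows like sqrt(n)*u."""
    return math.exp(lam * math.sqrt(n) * u + n * u * u / (1.0 - u)) - 1.0


def mac_fp16(a, b):
    """Multiply-accumulate with every product and every partial sum rounded to fp16.

    Each operation is evaluated exactly in float64 and then rounded once to
    float16, so each step incurs a single rounding.
    """
    s = np.float16(0.0)
    for x, y in zip(a, b):
        p = np.float16(float(x) * float(y))   # rounded product
        s = np.float16(float(s) + float(p))   # rounded accumulation
    return float(s)


def compare(n=512, seed=0):
    """Return (observed relative error, deterministic constant, probabilistic constant)."""
    rng = np.random.default_rng(seed)
    a = rng.random(n).astype(np.float16)      # random data already stored in fp16
    b = rng.random(n).astype(np.float16)
    a64, b64 = a.astype(np.float64), b.astype(np.float64)
    exact = float(np.dot(a64, b64))           # float64 reference value
    scale = float(np.dot(np.abs(a64), np.abs(b64)))
    err = abs(mac_fp16(a, b) - exact) / scale
    return err, gamma_det(n), gamma_prob(n)


if __name__ == "__main__":
    err, det, prob = compare()
    print(f"observed error: {err:.3e}")
    print(f"deterministic constant: {det:.3e}")
    print(f"probabilistic constant: {prob:.3e}")
```

The deterministic constant grows linearly in n*u, while the assumed probabilistic constant grows like sqrt(n)*u, which is why the latter becomes much tighter as the inner dimension grows. The observed error for a given random draw is only guaranteed to sit below the deterministic constant. Whether it also sits below the probabilistic one depends on lam and on the data.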