Given the importance of floating-point~(FP) performance in numerous domains, several new variants of FP and alternatives to it have been proposed (e.g., Bfloat16, TensorFloat32, and Posits). These representations do not have correctly rounded math libraries. Further, using existing FP libraries for these new representations can produce incorrect results. This paper proposes a novel methodology for generating polynomial approximations that can be used to implement correctly rounded math libraries. Existing methods produce polynomials that approximate the real value of an elementary function $f(x)$; they can produce wrong results because of errors in the approximation and rounding errors in the implementation. In contrast, our approach generates polynomials that approximate the correctly rounded value of $f(x)$ (i.e., the value of $f(x)$ rounded to the target representation). This methodology provides more margin to identify efficient polynomials that produce correctly rounded results for all inputs. We frame the problem of generating efficient polynomials that produce correctly rounded results as a linear programming problem. Our approach guarantees that we produce the correct result even with range reduction techniques. Using our approach, we have developed correctly rounded, yet faster, implementations of elementary functions for multiple target representations. Our Bfloat16 library is 2.3$\times$ faster than the corresponding state-of-the-art library while producing correct results for all inputs.
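To make the idea of approximating the correctly rounded value concrete, the following is a minimal Python sketch, not the paper's implementation. It rounds a double to Bfloat16 (round-to-nearest, ties-to-even), computes the interval of reals that round to a given Bfloat16 value, and checks whether a candidate polynomial lands inside that interval for each input. The helper names (`to_bfloat16`, `rounding_interval`, `horner`), the use of `math.exp` in double precision as a stand-in oracle, the degree-2 Taylor coefficients, and the restriction to positive normal-range values are all assumptions made for illustration.

```python
import math

SIG_BITS = 8  # Bfloat16 precision: 1 implicit + 7 stored significand bits


def to_bfloat16(v):
    """Round a normal-range double to the nearest Bfloat16 value (ties to even)."""
    m, e = math.frexp(v)            # v == m * 2**e with 0.5 <= |m| < 1
    k = round(m * 2 ** SIG_BITS)    # scaling is exact; round() breaks ties to even
    return math.ldexp(k, e - SIG_BITS)


def rounding_interval(y):
    """Return (lo, hi): midpoints between positive Bfloat16 y and its neighbours.

    Every real strictly between lo and hi rounds to y. Whether the endpoints
    themselves round to y depends on the tie rule (they do when y's significand
    is even).
    """
    m, e = math.frexp(y)
    k = int(m * 2 ** SIG_BITS)              # integer significand in [128, 255]
    ulp = math.ldexp(1.0, e - SIG_BITS)     # spacing to the next larger value
    up = y + ulp
    # Just below a power of two the spacing halves.
    down = y - (ulp / 2 if k == 2 ** (SIG_BITS - 1) else ulp)
    return (y + down) / 2, (y + up) / 2


def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... in double precision."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


# All Bfloat16 inputs in [2**-4, 2**-3).
inputs = [math.ldexp(k, -11) for k in range(128, 256)]
# Hypothetical candidate: degree-2 Taylor polynomial of exp (illustrative only).
coeffs = [1.0, 1.0, 0.5]

hits = 0
for x in inputs:
    # Assumption: double-precision exp stands in for a high-precision oracle.
    target = to_bfloat16(math.exp(x))
    lo, hi = rounding_interval(target)
    if lo < horner(coeffs, x) < hi:     # conservative: excludes the endpoints
        hits += 1

print(f"{hits}/{len(inputs)} inputs land strictly inside their rounding interval")
```

For a fixed input $x$, the condition `lo <= P(x) <= hi` is a pair of linear inequalities in the polynomial's coefficients, which is why the search for coefficients can be posed as a linear program. The interval is typically much wider than the error budget available when approximating the real value of $f(x)$ directly, which is the extra margin the abstract refers to.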