There is growing interest in the use of reduced-precision arithmetic, fueled by the recent surge of interest in artificial intelligence, and deep learning in particular. Most architectures already provide reduced-precision capabilities (e.g., 8-bit integer, 16-bit floating-point). In the context of FPGAs, any number format and bit-width can even be considered.

In computer arithmetic, the representation of real numbers is a major issue. Fixed-point (FxP) and floating-point (FlP) are the main options for representing reals, each with its own advantages and drawbacks. This chapter presents both the FxP and FlP number representations, and draws a fair comparison between their cost, performance, and energy, as well as their impact on accuracy during computations.

It is shown that the choice between FxP and FlP is not obvious and strongly depends on the application considered. In some cases, low-precision floating-point arithmetic can be the most effective, providing benefits over the classical fixed-point choice for energy-constrained applications.
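To make the trade-off concrete, the following minimal sketch (not from the chapter; the helper names to_fixed and to_half are illustrative) quantizes a few values to a signed 16-bit Q8.8 fixed-point format and to IEEE 754 binary16, showing how FxP exhibits uniform absolute error but limited range, while FlP trades a wider dynamic range for relative error:

```python
import struct

def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantize x to a signed fixed-point value (round to nearest, saturate)."""
    scale = 1 << frac_bits
    raw = round(x * scale)
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    raw = max(lo, min(hi, raw))  # saturate on overflow
    return raw / scale

def to_half(x):
    """Round x to IEEE 754 binary16 by packing and unpacking it."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

for x in (0.1, 3.14159, 200.0, 0.0003):
    print(f"{x:>10}: Q8.8 = {to_fixed(x):.6f}   binary16 = {to_half(x):.6f}")
```

Running this shows 200.0 saturating in Q8.8 (range limit) and 0.0003 flushing to zero (resolution limit), while binary16 represents both with small relative error, which is one reason the FxP-versus-FlP choice depends on the value distribution of the application.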