Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory-intensive. Besides reducing the memory footprint, using FP32 (single) precision instead of FP64 (double) precision yields significant performance benefits, especially on GPUs. Here, we evaluate the possibility of going further and using FP16 and Posit16 (half) precision for storing fluid populations, while still carrying out arithmetic operations in FP32. To this end, we first show that the number range commonly occurring in the LBM is much smaller than the FP16 number range. Based on this observation, we develop novel 16-bit formats, based on a modified IEEE-754 standard and on a modified Posit standard, that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six different test systems of increasing complexity: Poiseuille flow, Taylor-Green vortices, Karman vortex streets, lid-driven cavity, a microcapsule in shear flow (utilizing the immersed-boundary method) and finally the impact of a raindrop (based on a Volume-of-Fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that in a large number of cases even 16-bit precision is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that mixed FP32/16-bit precision achieves a significant speedup.
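The storage/arithmetic split described above can be illustrated with a minimal sketch. It is not the paper's implementation: it uses only the standard IEEE-754 binary16 format (via Python's `struct` module), not the custom 16-bit formats, and the helper names `round_to_fp16`, `round_to_fp32` and `store_and_load` are hypothetical. The sample populations are assumed to equal the standard D2Q9 lattice weights, which corresponds to a fluid at rest with unit density.

```python
import struct


def round_to_fp16(x: float) -> float:
    """Round x to the nearest IEEE-754 binary16 value (emulates 16-bit storage)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_fp32(x: float) -> float:
    """Round x to the nearest IEEE-754 binary32 value (emulates FP32 arithmetic)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def store_and_load(populations):
    """Write populations to 16-bit storage, then read them back as FP32 values."""
    return [round_to_fp32(round_to_fp16(f)) for f in populations]


# Assumed sample data: D2Q9 lattice weights (fluid at rest, density 1).
populations = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

loaded = store_and_load(populations)

# Relative error introduced purely by the 16-bit storage step.
max_rel_error = max(abs(a - b) / abs(b) for a, b in zip(loaded, populations))

# Density is the sum of all populations; accumulate it with FP32 rounding
# after every addition to mimic FP32 arithmetic on the loaded values.
density = 0.0
for f in loaded:
    density = round_to_fp32(density + f)

print(f"max relative storage error: {max_rel_error:.3e}")
print(f"density after FP16 round trip: {density:.6f}")
```

For values inside the normal FP16 range, the storage step is expected to contribute a relative error of at most 2^-11 per population, while the FP32 accumulation contributes comparatively little. Because the sample values span less than two orders of magnitude, most of the FP16 exponent range goes unused here, which is consistent with the observation that motivates formats tailored to the LBM.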