A growing number of applications require numerical simulations to solve complex problems. Most of these numerical algorithms are massively parallel and are often implemented on parallel high-performance computers. However, classic CPU-based platforms suffer under the demand for higher resolutions and the exponential growth of data. FPGAs offer a powerful and flexible alternative that can host accelerators to complement such platforms. Developing such application-specific accelerators remains challenging because it is hard to write code that synthesizes into efficient hardware. In this paper, we study the challenges of porting a numerical simulation kernel onto an FPGA. We propose an automated tool flow that starts from a domain-specific language (DSL) and generates accelerators for computational fluid dynamics on FPGAs. Our DSL-based flow simplifies the exploration of parameters and constraints such as on-chip memory usage. We also propose a decoupled optimization of memory and logic resources, which allows us to make better use of the limited FPGA resources. In our preliminary evaluation, this doubled the number of parallel kernels and increased the accelerator's speedup over ARM execution from 7 times to 12 times.