In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of convenient APIs that let users express flexible and complex inter-task communication structures. Second, TAPA adds a coarse-grained floorplanning step to HLS compilation so that potential critical paths can be pipelined accurately. In addition, TAPA implements several optimization techniques tailored specifically to modern HBM-based FPGAs. In our experiments with a total of 43 designs, we improve the average frequency from 147 MHz to 297 MHz (a 102% improvement) with no loss of throughput and a negligible change in resource utilization. Notably, in 16 of these experiments, designs that were originally unroutable become routable and achieve 274 MHz on average. Both the framework and its core floorplan module are available online.
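To make the phrase "task-parallel dataflow program" concrete, the following is a minimal software sketch in standard C++, not TAPA's actual API. It emulates the programming model with threads: each task is a plain function, and tasks communicate only through bounded FIFO channels. All names here (`Channel`, `Producer`, `Add`, `Consumer`, `RunVectorAdd`) are hypothetical and chosen for illustration, and the sketch does not model floorplanning or pipelining.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded FIFO standing in for a hardware stream (not TAPA's API).
template <typename T>
class Channel {
 public:
  explicit Channel(std::size_t capacity) : capacity_(capacity) {}

  // Blocks while the FIFO is full, mimicking back-pressure on a stream.
  void Write(T value) {
    std::unique_lock<std::mutex> lock(mu_);
    not_full_.wait(lock, [this] { return fifo_.size() < capacity_; });
    fifo_.push(std::move(value));
    not_empty_.notify_one();
  }

  // Signals end of transmission; the writer must not call Write afterwards.
  void Close() {
    std::lock_guard<std::mutex> lock(mu_);
    closed_ = true;
    not_empty_.notify_all();
  }

  // Blocks until a value is available; returns nullopt once closed and drained.
  std::optional<T> Read() {
    std::unique_lock<std::mutex> lock(mu_);
    not_empty_.wait(lock, [this] { return !fifo_.empty() || closed_; });
    if (fifo_.empty()) return std::nullopt;
    T value = std::move(fifo_.front());
    fifo_.pop();
    not_full_.notify_one();
    return value;
  }

 private:
  std::size_t capacity_;
  bool closed_ = false;
  std::queue<T> fifo_;
  std::mutex mu_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
};

// Task: streams the first n elements of a buffer into a channel.
void Producer(const std::vector<float>& data, std::size_t n, Channel<float>& out) {
  for (std::size_t i = 0; i < n; ++i) out.Write(data[i]);
  out.Close();
}

// Task: reads one element from each input stream and writes their sum.
void Add(Channel<float>& a, Channel<float>& b, Channel<float>& sum) {
  while (true) {
    std::optional<float> x = a.Read();
    std::optional<float> y = b.Read();
    if (!x || !y) break;
    sum.Write(*x + *y);
  }
  sum.Close();
}

// Task: drains a stream into a result buffer.
void Consumer(Channel<float>& in, std::vector<float>& result) {
  while (std::optional<float> v = in.Read()) result.push_back(*v);
}

// Top level: instantiates the channels and runs all four tasks concurrently.
std::vector<float> RunVectorAdd(const std::vector<float>& a,
                                const std::vector<float>& b) {
  const std::size_t n = std::min(a.size(), b.size());
  Channel<float> a_q(2), b_q(2), sum_q(2);
  std::vector<float> result;
  std::thread t1([&] { Producer(a, n, a_q); });
  std::thread t2([&] { Producer(b, n, b_q); });
  std::thread t3([&] { Add(a_q, b_q, sum_q); });
  std::thread t4([&] { Consumer(sum_q, result); });
  t1.join();
  t2.join();
  t3.join();
  t4.join();
  return result;
}
```

Calling `RunVectorAdd({1, 2, 3}, {10, 20, 30})` runs the four tasks as a small pipeline. Because tasks interact only through channels, the communication structure is explicit in the top-level function, which is the property a dataflow compiler can exploit when mapping tasks and streams to hardware.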