FPGAs are an attractive type of accelerator for general-purpose HPC systems because tailored hardware can be deployed on demand. However, the common tools for programming and operating FPGAs are still complex to use, especially in scenarios where diverse types of tasks must be executed dynamically. In this work, we present a programming abstraction with a simple interface that internally leverages High-Level Synthesis, Dynamic Partial Reconfiguration and synchronisation mechanisms to use an FPGA as a multi-tasking server with preemptive scheduling and priority queues. This leads to better use of the FPGA resources, allowing several kernels to execute at the same time and the most urgent ones to be deployed as fast as possible. The results of our experimental study show that our approach incurs an overhead of only 1.66% when using one Reconfigurable Region (RR) and 4.04% when using two RRs, whilst delivering a significant performance improvement over the traditional non-preemptive, full-reconfiguration approach.
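To make the scheduling idea concrete, the following is a minimal host-side sketch of preemptive, priority-based dispatch over a fixed number of RRs. It is an illustration under stated assumptions, not the interface presented in this work: the names `Task`, `RegionScheduler`, `submit` and `tick` are hypothetical, a lower priority number is assumed to mean a more urgent task, work is counted in abstract units, and bitstream loading and kernel context save/restore are reduced to log entries.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass(order=True)
class Task:
    priority: int                            # assumption: lower value = more urgent
    seq: int                                 # FIFO tie-breaker among equal priorities
    name: str = field(compare=False)
    remaining: int = field(compare=False)    # abstract work units left


class RegionScheduler:
    """Toy model: each reconfigurable region (RR) runs at most one task."""

    def __init__(self, num_regions: int) -> None:
        self.regions: List[Optional[Task]] = [None] * num_regions
        self.queue: List[Task] = []          # min-heap of waiting tasks
        self._seq = count()
        self.log: List[str] = []

    def _load(self, idx: int, task: Task) -> None:
        # Stands in for partially reconfiguring region idx with the task's kernel.
        self.regions[idx] = task
        self.log.append(f"load {task.name} on RR{idx}")

    def submit(self, name: str, priority: int, work: int) -> None:
        task = Task(priority, next(self._seq), name, work)
        # 1. Use a free region if there is one.
        for idx, running in enumerate(self.regions):
            if running is None:
                self._load(idx, task)
                return
        # 2. Otherwise preempt the least urgent running task, but only if
        #    the new task is strictly more urgent than it.
        victim_idx = max(range(len(self.regions)),
                         key=lambda i: self.regions[i].priority)
        victim = self.regions[victim_idx]
        if victim.priority > task.priority:
            self.log.append(f"preempt {victim.name} on RR{victim_idx}")
            heapq.heappush(self.queue, victim)   # keeps its remaining work
            self._load(victim_idx, task)
        else:
            # 3. No preemption possible: wait in the priority queue.
            heapq.heappush(self.queue, task)

    def tick(self) -> None:
        """Advance every running task by one unit and refill freed regions."""
        for idx, task in enumerate(self.regions):
            if task is None:
                continue
            task.remaining -= 1
            if task.remaining == 0:
                self.log.append(f"finish {task.name} on RR{idx}")
                self.regions[idx] = None
                if self.queue:
                    self._load(idx, heapq.heappop(self.queue))


# Usage: one RR, a long low-priority task interrupted by an urgent one.
sched = RegionScheduler(num_regions=1)
sched.submit("blur", priority=5, work=3)
sched.tick()
sched.submit("alarm", priority=0, work=1)    # preempts "blur"
for _ in range(3):
    sched.tick()
print("\n".join(sched.log))
```

In this toy model a preempted task returns to the queue with its remaining work intact and resumes once a region becomes free. A real system would additionally have to save and restore kernel state and account for the reconfiguration time of each load.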