With deep learning (DL) now prevalent in many applications, researchers are investigating ways to optimize FPGA architecture and CAD to achieve better quality-of-results (QoR) on DL workloads. Benchmark circuits are essential to this optimization process: the QoR achieved on a set of benchmarks is the main driver of architecture and CAD design choices. However, current academic benchmark suites are inadequate, as they do not include any designs from the DL domain. This work presents Koios, a new suite of DL acceleration benchmark circuits for FPGA architecture and CAD research. The suite's 19 circuits cover a wide variety of accelerated neural networks, design sizes, implementation styles, abstraction levels, and numerical precisions. Compared to existing open-source benchmarks, these designs are larger, more data parallel, more heterogeneous, more deeply pipelined, and use more FPGA architectural features. This enables researchers to pinpoint architectural inefficiencies for this class of workloads and to optimize CAD tools on more realistic benchmarks that stress the CAD algorithms in different ways. In this paper, we describe the designs in our benchmark suite, present results of running them through the Verilog-to-Routing (VTR) flow using a recent FPGA architecture model, and identify key insights from the resulting metrics. On average, compared to the widely used VTR suite, our benchmarks have 3.7x more netlist primitives, 1.8x higher DSP density, 4.7x higher BRAM density, and 1.7x higher frequency with 1.9x more near-critical paths. Finally, we present two case studies showing how architectural exploration for DL-optimized FPGAs can be performed using our new benchmark suite.