As customized accelerator design becomes increasingly popular as a way to keep up with the demand for high-performance computing, simulators face the challenge of adapting to a large variety of accelerators. Existing simulators tend toward two extremes: low-level, general approaches, such as RTL simulation, that can model any hardware but require substantial effort and long execution times; and higher-level, application-specific models that are much faster and easier to use but require one-off engineering effort for each design. This work proposes a compiler-driven simulation workflow that can model configurable hardware accelerators. The key idea is to separate structure representation from simulation by developing an intermediate language that can flexibly represent a wide variety of hardware constructs. We design the Event Queue (EQueue) dialect of MLIR, a dialect that can model arbitrary hardware accelerators with explicit data movement and distributed event-based control. We also implement a generic simulation engine that models EQueue programs containing a mix of MLIR dialects at different abstraction levels. We present two case studies of accelerators implemented in EQueue: a systolic array for convolution and the SIMD processors in a modern FPGA. In the former, we show that EQueue simulation is as accurate as a state-of-the-art simulator while offering higher extensibility and lower iteration cost via compiler passes. In the latter, we demonstrate that our simulation flow can guide designers to improve their designs efficiently using visualizable simulation outputs.
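The separation of structure representation from simulation can be illustrated with a toy discrete-event model. The sketch below is not EQueue dialect syntax and not the authors' engine; the `Op` record, the `simulate` function, the unit names (`dma`, `pe`), and the latencies are all hypothetical, chosen only to show a hardware description kept as plain data and a generic event-queue engine that knows nothing about the specific design.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Op:
    """One operation in the (hypothetical) structural description."""
    name: str
    unit: str                 # hardware unit that executes the op
    latency: int              # cycles the unit is busy
    deps: list = field(default_factory=list)  # names of ops that must finish first


def simulate(ops):
    """Generic engine: returns (finish_cycle_per_op, total_cycles)."""
    by_name = {op.name: op for op in ops}
    remaining = {op.name: len(op.deps) for op in ops}
    users = {op.name: [] for op in ops}
    for op in ops:
        for dep in op.deps:
            users[dep].append(op.name)

    unit_free = {}            # unit -> cycle at which it becomes idle
    events = []               # min-heap of (completion_cycle, seq, op_name)
    finish = {}
    seq = 0                   # tie-breaker so heap order is deterministic

    def launch(name, now):
        nonlocal seq
        op = by_name[name]
        # An op starts when its inputs are ready AND its unit is idle.
        start = max(now, unit_free.get(op.unit, 0))
        end = start + op.latency
        unit_free[op.unit] = end
        heapq.heappush(events, (end, seq, name))
        seq += 1

    for op in ops:
        if not op.deps:
            launch(op.name, 0)

    while events:
        now, _, name = heapq.heappop(events)
        finish[name] = now
        # A completion event may unblock dependent ops.
        for user in users[name]:
            remaining[user] -= 1
            if remaining[user] == 0:
                launch(user, now)

    return finish, max(finish.values(), default=0)


# Structure only: two tiles, each loaded by a DMA unit, then processed by a PE.
ops = [
    Op("load0", unit="dma", latency=4),
    Op("load1", unit="dma", latency=4),
    Op("mac0", unit="pe", latency=3, deps=["load0"]),
    Op("mac1", unit="pe", latency=3, deps=["load1"]),
]

finish, total = simulate(ops)
print(finish, total)
```

Because the engine only consumes the `ops` list, changing the modeled hardware (for example, adding a second DMA unit or altering a latency) means editing the description rather than the simulator, which is the property the abstract attributes to a compiler-driven flow.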