Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism

The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware depend heavily on cycle-accurate simulation of register-transfer-level (RTL) designs. The fastest software RTL simulators can simulate designs at 1 to 1000 kHz, i.e., more than three orders of magnitude slower than hardware. Improved simulators can increase designers' productivity by speeding design iterations and permitting more exhaustive exploration. One possibility is to exploit low-level parallelism, as RTL expresses considerable fine-grain concurrency. Unfortunately, state-of-the-art RTL simulators often perform best on a single core since modern processors cannot effectively exploit fine-grain parallelism. This work presents Manticore: a parallel computer designed to accelerate RTL simulation. Manticore uses a static bulk-synchronous parallel (BSP) execution model to eliminate fine-grain synchronization overhead. It relies entirely on a compiler to schedule resources and communication, which is feasible since RTL code contains few divergent execution paths. With static scheduling, communication and synchronization no longer incur runtime overhead, making fine-grain parallelism practical. Moreover, static scheduling dramatically simplifies processor implementation, significantly increasing the number of cores that fit on a chip. Our 225-core FPGA implementation running at 475 MHz outperforms a state-of-the-art RTL simulator running on desktop and server computers in 8 out of 9 benchmarks.
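To make the static BSP idea concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not Manticore's compiler, instruction set, or hardware. The two-register design, the `PARTITION` mapping, and the helpers `compile_schedule`, `simulate_bsp`, and `simulate_reference` are all hypothetical names introduced for this example. Each simulated clock cycle is one superstep: every core evaluates its registers from local copies only, then values move between cores according to a message list that was fixed before simulation started.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]

# Hypothetical toy design: an 8-bit counter and a 16-bit accumulator of its value.
NEXT_STATE: Dict[str, Callable[[State], int]] = {
    "count": lambda s: (s["count"] + 1) & 0xFF,
    "acc": lambda s: (s["acc"] + s["count"]) & 0xFFFF,
}
# Registers that each next-state function reads (known statically from the RTL).
READS: Dict[str, List[str]] = {"count": ["count"], "acc": ["acc", "count"]}
# Assumed "compile-time" placement of registers onto cores.
PARTITION: Dict[int, List[str]] = {0: ["count"], 1: ["acc"]}


def compile_schedule(partition, reads) -> List[Tuple[int, int, str]]:
    """Return the fixed (src_core, dst_core, register) messages sent every cycle."""
    owner = {reg: core for core, regs in partition.items() for reg in regs}
    sends: List[Tuple[int, int, str]] = []
    for core, regs in partition.items():
        for reg in regs:
            for dep in reads[reg]:
                msg = (owner[dep], core, dep)
                if owner[dep] != core and msg not in sends:
                    sends.append(msg)
    return sends


def simulate_bsp(cycles: int, init: State) -> State:
    sends = compile_schedule(PARTITION, READS)  # decided once, before simulation
    # Each core keeps local copies of the registers it owns or reads.
    local: Dict[int, State] = {core: {} for core in PARTITION}
    for core, regs in PARTITION.items():
        for reg in regs:
            local[core][reg] = init[reg]
            for dep in READS[reg]:
                local[core][dep] = init[dep]
    for _ in range(cycles):
        # Compute phase: cores use only local data, so no locks are needed.
        updates = {
            core: {reg: NEXT_STATE[reg](local[core]) for reg in regs}
            for core, regs in PARTITION.items()
        }
        for core, vals in updates.items():
            local[core].update(vals)
        # Communication phase: follow the precomputed message list.
        for src, dst, reg in sends:
            local[dst][reg] = local[src][reg]
        # The end of the loop body plays the role of the global barrier.
    return {reg: local[core][reg] for core, regs in PARTITION.items() for reg in regs}


def simulate_reference(cycles: int, init: State) -> State:
    """Single-threaded reference: update all registers from the old state."""
    state = dict(init)
    for _ in range(cycles):
        state = {reg: fn(state) for reg, fn in NEXT_STATE.items()}
    return state


if __name__ == "__main__":
    start = {"count": 0, "acc": 0}
    print(simulate_bsp(10, start), simulate_reference(10, start))
```

Because the message list is computed once from the design's dependencies, the per-cycle loop contains no dynamic decisions about who communicates with whom. That is the property the abstract relies on when it says communication and synchronization incur no runtime overhead.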
