Demand is growing for specialized hardware for training deep neural networks, both in edge/IoT environments and in high-performance computing systems. The design space of such hardware is very large because of the wide range of processing architectures, deep neural network configurations, and dataflow options, which makes developing deep neural network processors complex, especially for training. We present TRIM, an infrastructure that helps hardware architects explore the design space of deep neural network accelerators for both inference and training in the early design stages. TRIM evaluates designs at the whole-network level, considering both inter-layer and intra-layer activities. Given the target applications, essential hardware specifications, and a design goal, TRIM can quickly explore different hardware design options, select the optimal dataflow, and guide the design of new hardware architectures. We validated TRIM with FPGA-based implementations of deep neural network accelerators and with ASIC-based architectures. We also show, through several case studies, how to use TRIM to explore the design space. TRIM thus helps architects evaluate different hardware choices in order to develop efficient inference and training architectures.
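The exploration flow described above (applications, hardware specifications, and a design goal in; a hardware option and dataflow out) can be illustrated with a toy sketch. This is not TRIM's actual model or interface: the `Layer` and `Hardware` fields, the `REFETCH` multipliers, and the cost formulas are all hypothetical placeholders chosen only to show the shape of an exhaustive search over hardware options and dataflows. In particular, the sketch sums independent per-layer costs and ignores the inter-layer effects that TRIM is stated to model.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Layer:
    macs: int         # multiply-accumulate operations in the layer
    weights: int      # number of weight values
    activations: int  # number of activation values


@dataclass(frozen=True)
class Hardware:
    num_pes: int       # number of processing elements
    mac_energy: float  # energy per MAC (arbitrary units)
    mem_energy: float  # energy per off-chip access (arbitrary units)


# Hypothetical refetch multipliers: how many times each kind of data is
# read from off-chip memory under a given dataflow. Illustrative values only.
REFETCH = {
    "weight_stationary": {"weights": 1, "activations": 3},
    "output_stationary": {"weights": 3, "activations": 1},
}


def estimate(layers, hw, dataflow):
    """Return (latency, energy) for the whole network under a toy cost model."""
    refetch = REFETCH[dataflow]
    latency = 0
    energy = 0.0
    for layer in layers:
        # Toy latency: MACs spread evenly over all processing elements.
        latency += math.ceil(layer.macs / hw.num_pes)
        # Toy energy: compute energy plus off-chip traffic energy.
        traffic = (layer.weights * refetch["weights"]
                   + layer.activations * refetch["activations"])
        energy += layer.macs * hw.mac_energy + traffic * hw.mem_energy
    return latency, energy


def explore(layers, hw_options, goal):
    """Exhaustively search (hardware, dataflow) pairs for the given goal.

    goal is "latency" or "energy"; returns (hw, dataflow, latency, energy).
    """
    metric_index = {"latency": 0, "energy": 1}[goal]
    best = None
    for hw, dataflow in product(hw_options, REFETCH):
        cost = estimate(layers, hw, dataflow)
        if best is None or cost[metric_index] < best[2:][metric_index]:
            best = (hw, dataflow, cost[0], cost[1])
    return best
```

Calling `explore(layers, hw_options, "energy")` with a list of `Layer` objects and candidate `Hardware` configurations returns the lowest-energy pair under these assumed costs; switching the goal to `"latency"` can select a different configuration, which is the kind of trade-off an early-stage exploration tool is meant to expose.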