Sparsity is a growing trend in modern DNN models. Existing Sparse-Sparse Matrix Multiplication (SpMSpM) accelerators are tailored to a particular SpMSpM dataflow (i.e., Inner Product, Outer Product, or Gustavson's), which determines their overall efficiency. We demonstrate that this static decision inherently results in a suboptimal dynamic solution. This is because different SpMSpM kernels show varying features (i.e., dimensions, sparsity pattern, sparsity degree), which makes each dataflow better suited to different data sets. In this work we present Flexagon, the first reconfigurable SpMSpM accelerator that is capable of performing SpMSpM computation by using the particular dataflow that best matches each case. The Flexagon accelerator is based on a novel Merger-Reduction Network (MRN) that unifies the concepts of reducing and merging in the same substrate, increasing efficiency. Additionally, Flexagon also includes a 3-tier memory hierarchy, specifically tailored to the different access characteristics of the input and output compressed matrices. Using detailed cycle-level simulation of contemporary DNN models from a variety of application domains, we show that Flexagon achieves average performance benefits of 4.59x, 1.71x, and 1.35x with respect to the state-of-the-art SIGMA-like, SpArch-like, and GAMMA-like accelerators (265%, 67%, and 18%, respectively, in terms of average performance/area efficiency).
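To make the contrast between the three dataflows concrete, the following is a minimal sketch (not Flexagon's implementation) of Inner Product, Outer Product, and Gustavson's row-wise SpMSpM over a simple dict-of-dicts sparse representation. The representation and function names are illustrative assumptions; what differs between the three is only the loop order over (i, j, k), which is exactly what determines each dataflow's access and merge behavior.

```python
# Sparse matrices as {row: {col: value}}; all three compute C = A @ B.

def inner_product(A, B):
    # Loops i, j outermost, k innermost: each output element is a dot
    # product, requiring intersection of A's row i with B's column j.
    cols = {j for Bk in B.values() for j in Bk}
    C = {}
    for i, Ai in A.items():
        for j in cols:
            s = sum(a * B[k][j] for k, a in Ai.items()
                    if k in B and j in B[k])
            if s:
                C.setdefault(i, {})[j] = s
    return C

def outer_product(A, B):
    # Loop k outermost: each step forms a rank-1 partial product from
    # A's column k and B's row k; partial outputs must be merged.
    C = {}
    ks = {k for Ai in A.values() for k in Ai}
    for k in ks:
        for i, Ai in A.items():
            if k in Ai and k in B:
                row = C.setdefault(i, {})
                for j, b in B[k].items():
                    row[j] = row.get(j, 0) + Ai[k] * b
    return C

def gustavson(A, B):
    # Loops i, k, j: produces one output row at a time by scaling and
    # merging the rows of B selected by the nonzeros of A's row i.
    C = {}
    for i, Ai in A.items():
        row = {}
        for k, a in Ai.items():
            for j, b in B.get(k, {}).items():
                row[j] = row.get(j, 0) + a * b
        if row:
            C[i] = row
    return C
```

Inner Product only ever reads inputs but wastes work on empty intersections; Outer Product never wastes multiplies but must merge many partial outputs; Gustavson's sits in between, which is why no single choice wins across all sparsity patterns.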