Tiled spatial architectures have proved to be an effective way to build large-scale DNN accelerators. The interconnect between tiles is critical to the performance of these tile-based architectures. In this work, we identify the inefficiency of widely used traditional on-chip networks and the opportunity for software-hardware co-design. We propose METRO, whose basic idea is to decouple traffic scheduling policies from the hardware fabric and move them to the software level. METRO consists of two modules working in synergy: the METRO software scheduling framework, which coordinates the traffic, and the METRO hardware facilities, which deliver the data according to software configurations. We evaluate the co-design with a synthetic study using different flit sizes, which illustrates its effectiveness under various hardware resource constraints, and with a wide range of DNN models selected from real-world workloads. The results show that METRO achieves a 56.3% communication speedup on average and up to a 73.6% reduction in overall processing time compared with traditional on-chip network designs.
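To make the idea of moving traffic scheduling into software concrete, the following is a minimal illustrative sketch, not METRO's actual algorithm. It assumes a 2D mesh of tiles, dimension-ordered (XY) routing, a model in which each directed link carries one flit per cycle, and a simple greedy policy; the names `xy_route` and `schedule_flows` are hypothetical. The scheduler picks a start cycle for each flow ahead of time so that no two flits ever contend for the same link, which is the kind of decision a conventional on-chip network would otherwise resolve dynamically in hardware.

```python
from collections import defaultdict


def xy_route(src, dst):
    """Return the directed links on an XY route from src to dst (assumed routing)."""
    links = []
    x, y = src
    while x != dst[0]:  # travel along X first
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:  # then along Y
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def schedule_flows(flows):
    """Greedy offline scheduler (hypothetical policy).

    flows: list of (src, dst, num_flits) tuples.
    Returns the start cycle chosen for each flow, such that no link
    carries two flits in the same cycle.
    """
    busy = defaultdict(set)  # link -> set of cycles already reserved
    starts = []
    for src, dst, num_flits in flows:
        route = xy_route(src, dst)
        start = 0
        while True:
            # Flit f crosses hop h of the route at cycle start + h + f.
            needed = [(link, start + h + f)
                      for h, link in enumerate(route)
                      for f in range(num_flits)]
            if all(cycle not in busy[link] for link, cycle in needed):
                break
            start += 1
        for link, cycle in needed:
            busy[link].add(cycle)
        starts.append(start)
    return starts


if __name__ == "__main__":
    flows = [((0, 0), (2, 0), 2),   # two flits, two hops
             ((1, 0), (2, 0), 2),   # shares the link (1,0)->(2,0) with the first flow
             ((0, 1), (0, 0), 1)]   # uses a disjoint link
    print(schedule_flows(flows))
```

In this toy model the second flow is delayed until the shared link is free, while the third flow starts immediately because its route is disjoint. The resulting start cycles play the role of the software configuration that the hardware would then simply follow.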