Significant effort has been invested in developing toolflows that map Convolutional Neural Network (CNN) models to Field Programmable Gate Arrays (FPGAs), with the aim of automating the production of high-performing designs for a diverse set of applications. However, within these toolflows, the problem of finding an optimal mapping is often overlooked, with the expectation that end users will tune the generated hardware for their desired platform. This is particularly prominent within Streaming Architecture toolflows, where there is a large design space to explore. In this work, we establish the framework SAMO: a Streaming Architecture Mapping Optimiser. SAMO exploits the structure of CNN models and the common features that exist in Streaming Architectures, and casts the mapping optimisation problem under a unified methodology. Furthermore, SAMO explicitly explores the reconfigurability property of FPGAs, allowing the methodology to overcome mapping limitations imposed by certain toolflows under resource-constrained scenarios, as well as improve on the achievable throughput. Three optimisation methods (Brute-Force, Simulated Annealing and Rule-Based) have been developed in order to generate valid, high-performance designs for a range of target platforms and CNN models. Results show that SAMO-optimised designs can achieve 4-20x better performance compared to existing hand-tuned designs. The SAMO framework is open-source: https://github.com/AlexMontgomerie/samo.