As spiking-based deep learning inference applications are increasing in embedded systems, these systems tend to integrate neuromorphic accelerators such as $\mu$Brain to improve energy efficiency. We propose a $\mu$Brain-based scalable many-core neuromorphic hardware design to accelerate the computations of spiking deep convolutional neural networks (SDCNNs). To increase energy efficiency, cores are designed to be heterogeneous in terms of their neuron and synapse capacity (big cores have higher capacity than little ones), and they are interconnected using a parallel segmented bus interconnect, which leads to lower latency and energy compared to a traditional mesh-based Network-on-Chip (NoC). We propose a system software framework called SentryOS to map SDCNN inference applications to the proposed design. SentryOS consists of a compiler and a run-time manager. The compiler compiles an SDCNN application into subnetworks by exploiting the internal architecture of big and little $\mu$Brain cores. The run-time manager schedules these subnetworks onto cores and pipelines their execution to improve throughput. We evaluate the proposed big-little many-core neuromorphic design and the system software framework with five commonly-used SDCNN inference applications and show that the proposed solution reduces energy (between 37% and 98%), reduces latency (between 9% and 25%), and increases application throughput (between 20% and 36%). We also show that SentryOS can be easily extended to other spiking neuromorphic accelerators.