With an ever-growing number of parameters defining increasingly complex networks, Deep Learning has led to several breakthroughs surpassing human performance. As a result, moving the millions of model parameters causes a growing imbalance known as the memory wall. Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories. On the software side, the sequential Backpropagation algorithm prevents efficient parallelization and thus fast convergence. A novel method, Direct Feedback Alignment, resolves inherent layer dependencies by passing the error directly from the output to each layer. At the intersection of hardware/software co-design, there is a demand for algorithms that are tolerant of hardware nonidealities. This work therefore explores the interrelationship of implementing bio-plausible learning in-situ on neuromorphic hardware, emphasizing energy, area, and latency constraints. Using the benchmarking framework DNN+NeuroSim, we investigate the impact of hardware nonidealities and quantization on algorithm performance, as well as how network topologies and algorithm-level design choices scale the latency, energy, and area consumption of a chip. To the best of our knowledge, this work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa. The best accuracy results remain Backpropagation-based, notably when facing hardware imperfections. Direct Feedback Alignment, on the other hand, allows for significant speedup due to parallelization, reducing training time by a factor approaching N for N-layered networks.
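The core mechanism of Direct Feedback Alignment mentioned above can be sketched in a few lines: instead of propagating the error backward through the transposed weight matrices layer by layer (as Backpropagation does), each hidden layer receives the output error through its own fixed random feedback matrix, so all layer updates can be computed in parallel. The following is a minimal NumPy illustration under assumed layer sizes, tanh activations, a squared-error loss, and an arbitrary learning rate; it is not the implementation evaluated in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 3-layer MLP; the dimensions are illustrative assumptions.
d_in, d_h, d_out = 4, 8, 3
W1 = rng.normal(0, 0.1, (d_h, d_in))
W2 = rng.normal(0, 0.1, (d_h, d_h))
W3 = rng.normal(0, 0.1, (d_out, d_h))

# Fixed random feedback matrices (never trained): each one projects the
# output error straight to a hidden layer, replacing the transposed
# forward weights that Backpropagation would use.
B1 = rng.normal(0, 0.1, (d_h, d_out))
B2 = rng.normal(0, 0.1, (d_h, d_out))

def tanh_deriv(a):
    return 1.0 - np.tanh(a) ** 2

def dfa_step(x, y, lr=0.01):
    """One DFA training step on a single example; returns the loss."""
    global W1, W2, W3
    # Forward pass.
    a1 = W1 @ x
    h1 = np.tanh(a1)
    a2 = W2 @ h1
    h2 = np.tanh(a2)
    y_hat = W3 @ h2
    e = y_hat - y  # output error for a squared-error loss

    # DFA: every hidden layer gets the *same* output error through its
    # own fixed random matrix -- no layer waits on the layer above it,
    # so these deltas can be computed in parallel.
    d1 = (B1 @ e) * tanh_deriv(a1)
    d2 = (B2 @ e) * tanh_deriv(a2)

    W1 -= lr * np.outer(d1, x)
    W2 -= lr * np.outer(d2, h1)
    W3 -= lr * np.outer(e, h2)   # output layer update is the same as BP
    return float(0.5 * np.sum(e ** 2))
```

Because the hidden-layer deltas depend only on the output error and local quantities, the update for every layer can start as soon as the forward pass finishes, which is the source of the near-N-fold training speedup for N-layered networks reported above.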