Memory-augmented neural networks (MANNs) achieve better inference performance in many tasks with the help of an external memory. The recently developed differentiable neural computer (DNC) is a MANN that has been shown to excel at representing complicated data structures and learning long-term dependencies. The DNC's higher performance derives from new history-based attention mechanisms in addition to the previously used content-based attention mechanisms. History-based mechanisms require a variety of new compute primitives and state memories, which are not supported by existing neural network (NN) or MANN accelerators. We present HiMA, a tiled, history-based memory access engine with distributed memories in tiles. HiMA incorporates a multi-mode network-on-chip (NoC) to reduce communication latency and improve scalability. An optimal submatrix-wise memory partition strategy is applied to reduce the amount of NoC traffic, and a two-stage usage sort method leverages distributed tiles to improve computation speed. To make HiMA fundamentally scalable, we create a distributed version of DNC called DNC-D that allows almost all memory operations to be applied to local memories, with a trainable weighted summation producing the global memory output. Two approximation techniques, usage skimming and softmax approximation, are proposed to further enhance hardware efficiency. HiMA prototypes are created in RTL and synthesized in a 40nm technology. In simulations, HiMA running DNC and DNC-D demonstrates 6.47x and 39.1x higher speed, 22.8x and 164.3x better area efficiency, and 6.1x and 61.2x better energy efficiency over the state-of-the-art MANN accelerator. Compared to an Nvidia 3080Ti GPU, HiMA demonstrates speedups of up to 437x and 2,646x when running DNC and DNC-D, respectively.
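The DNC-D idea described above, where each tile operates on its own local memory and a trainable weighted summation combines the per-tile results into the global memory output, can be sketched as follows. This is a minimal illustration under assumed shapes and names (the tile count, memory dimensions, and variable names are hypothetical, not taken from the paper):

```python
import numpy as np

# Hypothetical sketch of DNC-D's distributed read: each tile t holds a local
# memory M_t (N rows of width W) and computes a local read r_t = w_t^T M_t
# using its own addressing weights; a trainable softmax over per-tile logits
# then mixes the local reads into one global read vector.

rng = np.random.default_rng(0)
T, N, W = 4, 16, 8  # number of tiles, rows per local memory, word width

local_memories = [rng.standard_normal((N, W)) for _ in range(T)]
read_weights = [np.full(N, 1.0 / N) for _ in range(T)]  # per-tile addressing

# Local reads stay within each tile (no cross-tile memory traffic)
local_reads = [w @ M for w, M in zip(read_weights, local_memories)]

# Trainable mixing coefficients (learned in training; random logits here)
logits = rng.standard_normal(T)
alpha = np.exp(logits) / np.exp(logits).sum()  # softmax over tiles

# Global memory output: weighted summation of the per-tile reads
global_read = sum(a * r for a, r in zip(alpha, local_reads))
assert global_read.shape == (W,)
```

The design point this illustrates is that only the small per-tile read vectors and the mixing step cross tile boundaries, which is what keeps the NoC traffic low and the architecture scalable.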