MultPIM: Fast Stateful Multiplication for Processing-in-Memory - 专知 (Zhuanzhi) Papers

September 20, 2021

MultPIM: Fast Stateful Multiplication for Processing-in-Memory

Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

From arXiv; accepted to IEEE Transactions on Circuits and Systems II (TCAS-II)

Processing-in-memory (PIM) seeks to eliminate data transfer between computation and memory by using devices that support both storage and logic. Stateful logic techniques such as IMPLY, MAGIC and FELIX can perform logic gates within memristive crossbar arrays with massive parallelism. Multiplication via stateful logic is an active field of research because of its wide implications. Recently, RIME has become the state-of-the-art algorithm for stateful single-row multiplication by using memristive partitions, reducing the latency of the previous state-of-the-art by 5.1x. In this paper, we begin by proposing novel partition-based computation techniques for broadcasting and shifting data. Then, we design an in-memory multiplication algorithm based on the carry-save add-shift (CSAS) technique. Finally, we develop a novel stateful full-adder that significantly improves on the state-of-the-art (FELIX) design. These contributions constitute MultPIM, a multiplier that reduces state-of-the-art time complexity from quadratic to linear-log. For 32-bit numbers, MultPIM improves latency by an additional 4.2x over RIME, while even slightly reducing area overhead. Furthermore, we optimize MultPIM for full-precision matrix-vector multiplication and improve latency by 25.5x over FloatPIM matrix-vector multiplication.
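
The abstract names the carry-save add-shift (CSAS) technique without describing it. As a rough illustration only, the sketch below models a generic CSAS multiplier in ordinary software: each step feeds the running sum vector, the carry vector, and one partial product through bitwise full adders, emits the lowest sum bit as the next product bit, and shifts the sum right. This is not the MultPIM in-memory algorithm and says nothing about memristive partitions or gate counts; the function name `csas_multiply` and its parameters are hypothetical and chosen for this example.

```python
def csas_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit integers with a carry-save add-shift loop.

    Illustrative software model only (an assumption of this example),
    not the in-memory MultPIM procedure.
    """
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("a and b must be unsigned n-bit integers")

    s = 0        # sum vector (n bits)
    c = 0        # carry vector (n bits), kept in place while s shifts
    product = 0  # product bits are emitted least-significant first

    # n steps consume the bits of b; n more steps flush s and c.
    for i in range(2 * n):
        # Partial product: a AND-ed with bit i of b (zero while flushing).
        pp = a if (i < n and (b >> i) & 1) else 0

        # One layer of full adders applied to every bit position at once:
        # s + c + pp == new_s + 2 * new_c.
        new_s = s ^ c ^ pp
        new_c = (s & c) | (s & pp) | (c & pp)

        # The lowest sum bit is final; emit it and shift the sum right.
        product |= (new_s & 1) << i
        s = new_s >> 1
        c = new_c

    return product


if __name__ == "__main__":
    print(csas_multiply(13, 11, 4))
```

Each loop iteration uses only bitwise operations that act on all bit positions independently, which is the property that makes the scheme attractive when the full adders can run in parallel; no carry ever has to ripple across the whole word within a single step.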
