A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference. It leverages a novel charge-domain multiply-and-accumulate (MAC) mechanism and circuitry to achieve superior linearity under process variations compared with conventional IMC designs. The adopted semi-parallel architecture efficiently stores filters from multiple CNN layers by sharing eight standard 6T SRAM cells with one charge-domain MAC circuit. Moreover, up to six weight bit-widths with two encoding schemes and eight input-activation levels are supported. A 7-bit charge-injection SAR (ciSAR) analog-to-digital converter (ADC) that eliminates the sample-and-hold (S&H) stage and the input/reference buffers further improves the overall energy efficiency and throughput. A 65-nm prototype validates the excellent linearity and computing accuracy of CAP-RAM. A single 512x128 macro stores a complete pruned and quantized CNN model, achieving 98.8% inference accuracy on the MNIST data set and 89.0% on the CIFAR-10 data set, with a 573.4-giga operations per second (GOPS) peak throughput and a 49.4-tera operations per second (TOPS)/W energy efficiency.
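The computation described above can be illustrated with a minimal behavioral sketch. This is not the authors' circuit, but an idealized functional model under assumed conventions: weights and activations are unsigned integers (activations restricted to the eight supported levels), the analog column sum is formed exactly, and the result is then quantized by a 7-bit ADC with a hypothetical `full_scale` parameter standing in for the ADC reference range.

```python
def charge_domain_mac(weights, activations, adc_bits=7, full_scale=None):
    """Idealized dot product followed by ADC quantization.

    weights     : list of non-negative ints (multi-bit weights)
    activations : list of ints in [0, 7] (eight input-activation levels)
    full_scale  : assumed ADC full-scale value; defaults to the exact sum
    """
    assert len(weights) == len(activations)
    assert all(0 <= a <= 7 for a in activations)

    # Charge-domain accumulation modeled as an exact analog sum.
    analog_sum = sum(w * a for w, a in zip(weights, activations))

    # Quantize to a 7-bit output code (127 is the maximum code).
    if full_scale is None:
        full_scale = max(analog_sum, 1)
    levels = (1 << adc_bits) - 1
    code = round(analog_sum / full_scale * levels)
    return min(code, levels)

# Example: a 4-element column with an assumed full-scale of 128.
print(charge_domain_mac([3, 1, 2, 0], [7, 4, 1, 5], full_scale=128))
```

In the real macro the accumulation happens on shared capacitors and the ciSAR ADC converts the charge directly; this sketch only captures the input/output behavior (integer MAC followed by bounded quantization), which is what determines the reported inference accuracy.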