In this paper, we present GATSPI, a novel GPU-accelerated logic gate simulator that enables ultra-fast power estimation for industry-sized ASIC designs with millions of gates. GATSPI is written in PyTorch with custom CUDA kernels for ease of coding and maintainability. It achieves a simulation kernel speedup of up to 1668X on a single-GPU system and up to 7412X on a multi-GPU system when compared to a commercial gate-level simulator running on a single CPU core. GATSPI supports a range of simple to complex cell types from an industry standard cell library and SDF conditional delay statements without requiring prior calibration runs, and produces industry-standard SAIF files from delay-aware gate-level simulation. Finally, we deploy GATSPI in a glitch-optimization flow, achieving a 1.4% power saving with a 449X speedup in turnaround time compared to a similar flow using a commercial simulator.
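To give a rough sense of why gate-level simulation maps well onto GPUs, the hypothetical sketch below evaluates one topological level of 2-input gates as a batched tensor operation in plain PyTorch. This is only an illustrative toy under our own assumptions (zero-delay, 2-input gates, a made-up truth-table encoding); the paper's actual simulator uses custom CUDA kernels and handles SDF delays, complex cells, and SAIF generation, none of which are shown here.

```python
# Hypothetical sketch: parallel evaluation of one levelized set of 2-input gates.
# All names (TRUTH, simulate_level) are illustrative, not GATSPI's actual API.
import torch

# Truth tables for 2-input gate types (row 0 = AND, 1 = OR, 2 = XOR),
# indexed by 2*a + b where a, b are the two input values.
TRUTH = torch.tensor([
    [0, 0, 0, 1],   # AND
    [0, 1, 1, 1],   # OR
    [0, 1, 1, 0],   # XOR
], dtype=torch.uint8)

def simulate_level(values, gate_type, in_a, in_b):
    """Evaluate all gates in one topological level in parallel.

    values    : (num_nets,)  uint8 tensor of current net values (0/1)
    gate_type : (num_gates,) long tensor indexing rows of TRUTH
    in_a/in_b : (num_gates,) long tensors of input net indices
    Returns a (num_gates,) uint8 tensor of gate outputs.
    """
    a = values[in_a].long()
    b = values[in_b].long()
    # One gather per level: every gate output is computed with no Python loop.
    return TRUTH[gate_type, a * 2 + b]

# Toy usage: two primary inputs feeding one AND gate and one XOR gate.
vals = torch.tensor([1, 1], dtype=torch.uint8)
gtype = torch.tensor([0, 2])                      # AND, XOR
outs = simulate_level(vals, gtype,
                      torch.tensor([0, 0]), torch.tensor([1, 1]))
print(outs)  # tensor([1, 0], dtype=torch.uint8)
```

Because each level is a single batched gather, the same code runs unchanged on a GPU by moving the tensors with `.to("cuda")`, which is the kind of data-parallel structure the reported kernel speedups exploit.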