MANA: Microarchitecting an Instruction Prefetcher

February 2, 2021

Ali Ansari,Fatemeh Golshan,Pejman Lotfi-Kamran,Hamid Sarbazi-Azad

From arXiv; 24 pages, 15 figures.

L1 instruction (L1-I) cache misses are a source of performance bottlenecks. Sequential prefetchers are a simple way to mitigate this problem; however, prior work has shown that they leave considerable potential uncovered. This observation has motivated many researchers to devise more advanced instruction prefetchers. In 2011, Proactive Instruction Fetch (PIF) showed that a hardware prefetcher could effectively eliminate all instruction-cache misses. However, its enormous storage cost makes it impractical. Consequently, reducing storage cost has been the main research focus in instruction prefetching over the past decade. Several instruction prefetchers, including RDIP and Shotgun, were proposed to offer PIF-level performance with significantly lower storage overhead. However, our findings show that there is a considerable performance gap between these proposals and PIF. Although these proposals use different prefetching mechanisms, the gap is largely due not to the mechanism but to insufficient storage. Prior proposals suffer from one or both of the following shortcomings: (1) they need a large number of metadata records to cover the potential, and (2) each record has a high storage cost. The first problem causes metadata misses, and the second prevents the prefetcher from storing enough records within reasonably sized storage.
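The interaction between the two shortcomings can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the paper: the function names, the storage budget, the record sizes, and the number of records needed are all hypothetical values chosen only to show how per-record cost limits how many records fit, and how a shortfall in stored records translates into metadata misses.

```python
def records_that_fit(budget_bytes: int, bits_per_record: int) -> int:
    """Number of whole metadata records that fit in a storage budget."""
    if bits_per_record <= 0:
        raise ValueError("bits_per_record must be positive")
    return (budget_bytes * 8) // bits_per_record


def resident_fraction(records_stored: int, records_needed: int) -> float:
    """Crude upper bound on the share of needed records that can be resident.

    Assumes every needed record is equally important, which real
    workloads do not guarantee; it is only a first-order estimate.
    """
    if records_needed <= 0:
        return 1.0
    return min(1.0, records_stored / records_needed)


if __name__ == "__main__":
    budget = 16 * 1024   # hypothetical 16 KiB metadata budget
    needed = 8192        # hypothetical number of records the workload needs

    for bits in (64, 32):  # hypothetical per-record costs in bits
        stored = records_that_fit(budget, bits)
        frac = resident_fraction(stored, needed)
        print(f"{bits}-bit records: {stored} fit, "
              f"at most {frac:.0%} of needed records resident")
```

Under these assumed numbers, halving the per-record cost doubles the number of records that fit in the same budget, which is the lever the abstract's second shortcoming points to; reducing the number of records needed addresses the first.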
