Neat: 低复杂度、高效率的芯片上缓存的一致性 (Neat: Low-Complexity, Efficient On-Chip Cache Coherence) - 专知论文

会员服务 ·

0

Performer · state-of-the-art · SimPLe · 计算机体系架构 ·

2021 年 7 月 12 日

Neat: Low-Complexity, Efficient On-Chip Cache Coherence

翻译：Neat: 低复杂度、高效率的芯片上缓存的一致性

Rui Zhang,Swarnendu Biswas,Vignesh Balaji,Michael D. Bond,Brandon Lucia

Cache coherence protocols such as MESI that use writer-initiated invalidation have high complexity and sometimes have poor performance and energy usage, especially under false sharing. Such protocols require numerous transient states, a shared directory, and support for core-to-core communication, while also suffering under false sharing. An alternative to MESI's writer-initiated invalidation is self-invalidation, which achieves lower complexity than MESI but adds high performance costs or relies on programmer annotations or specific data access patterns. This paper presents Neat, a low-complexity, efficient cache coherence protocol. Neat uses self-invalidation, thus avoiding MESI's transient states, directory, and core-to-core communication requirements. Neat uses novel mechanisms that effectively avoid many unnecessary self-invalidations. An evaluation shows that Neat is simple and has lower verification complexity than the MESI protocol. Neat not only outperforms state-of-the-art self-invalidation protocols, but its performance and energy consumption are comparable to MESI's, and it outperforms MESI under false sharing.

翻译：Cache 一致性协议,例如使用作者发起的无效化的 MOSI 协议,具有高度的复杂性,有时还存在不良的性能和能源使用,特别是在虚假共享的情况下。这类协议要求许多短暂状态、共享目录、支持核心至核心通信,同时在虚假共享下遭受痛苦。 MESI 作者发起的无效化的替代办法是自我验证,其复杂性比MESI 协议要低,但增加了高性能成本或依赖程序说明或特定数据访问模式。本文介绍了Neat, 低兼容性, 高效的缓存一致性协议。 Neat 使用自我验证, 从而避免MESI的短暂状态、目录和核心至核心通信要求。 Neat 使用新的机制, 有效避免许多不必要的自我估值。一项评估表明Neat 简单, 核查复杂性比MESI 协议要低。 Neat 不仅超越了最先进的自证化协议, 其性能和能源消耗量也与MESI 的相似, 而且它比MESI 错误共享。

0

相关内容

Performer

最新《并行编程》，599页pdf

专知会员服务

55+阅读 · 2021年7月21日

【WWW2021】高效的非抽样知识图谱嵌入

专知会员服务

38+阅读 · 2021年4月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

专知会员服务

10+阅读 · 2020年5月14日

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

专知会员服务

57+阅读 · 2020年3月13日

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

专知会员服务

44+阅读 · 2020年3月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

CCF推荐 | 国际会议信息8条

CCF推荐 | 国际会议信息8条

Call4Papers

9+阅读 · 2019年5月23日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

【论文推荐】最新5篇深度强化学习相关论文推荐—经验驱动的网络、自动数据库管理、双光技术推荐系统、UAVs、多代理竞争对手

【论文推荐】最新5篇深度强化学习相关论文推荐—经验驱动的网络、自动数据库管理、双光技术推荐系统、UAVs、多代理竞争对手

专知

5+阅读 · 2018年1月19日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Towards Practical Integrity in the Smart Home with HomeEndorser

Arxiv

0+阅读 · 2021年9月10日

ECO: Enabling Energy-Neutral IoT Devices through Runtime Allocation of Harvested Energy

Arxiv

0+阅读 · 2021年9月10日

Deep Active Inference for Pixel-Based Discrete Control: Evaluation on the Car Racing Problem

Arxiv

0+阅读 · 2021年9月9日

Maximizing the Set Cardinality of Users Scheduled for Ultra-dense uRLLC Networks

Arxiv

0+阅读 · 2021年9月9日

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Arxiv

28+阅读 · 2021年6月16日

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

Arxiv

9+阅读 · 2018年9月17日

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Arxiv

4+阅读 · 2018年7月30日

Nonparametric Topic Modeling with Neural Inference

Arxiv

3+阅读 · 2018年6月18日

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

Arxiv

3+阅读 · 2018年4月10日

Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

Arxiv

5+阅读 · 2018年3月23日

VIP会员

文章信息

相关主题

state-of-the-art

计算机体系架构

相关VIP内容

最新《并行编程》，599页pdf

专知会员服务

55+阅读 · 2021年7月21日

【WWW2021】高效的非抽样知识图谱嵌入

专知会员服务

38+阅读 · 2021年4月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

专知会员服务

10+阅读 · 2020年5月14日

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

专知会员服务

57+阅读 · 2020年3月13日

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

【图神经网络遇上符号计算】Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

专知会员服务

44+阅读 · 2020年3月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身操作的视觉-语言-动作模型综述

《多域空战指挥体系：驾驭复杂性的艺术》

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

相关资讯

CCF推荐 | 国际会议信息8条

CCF推荐 | 国际会议信息8条

Call4Papers

9+阅读 · 2019年5月23日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

【论文推荐】最新5篇深度强化学习相关论文推荐—经验驱动的网络、自动数据库管理、双光技术推荐系统、UAVs、多代理竞争对手

【论文推荐】最新5篇深度强化学习相关论文推荐—经验驱动的网络、自动数据库管理、双光技术推荐系统、UAVs、多代理竞争对手

专知

5+阅读 · 2018年1月19日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Towards Practical Integrity in the Smart Home with HomeEndorser

Arxiv

0+阅读 · 2021年9月10日

ECO: Enabling Energy-Neutral IoT Devices through Runtime Allocation of Harvested Energy

Arxiv

0+阅读 · 2021年9月10日

Deep Active Inference for Pixel-Based Discrete Control: Evaluation on the Car Racing Problem

Arxiv

0+阅读 · 2021年9月9日

Maximizing the Set Cardinality of Users Scheduled for Ultra-dense uRLLC Networks

Arxiv

0+阅读 · 2021年9月9日

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Arxiv

28+阅读 · 2021年6月16日

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

Arxiv

9+阅读 · 2018年9月17日

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Arxiv

4+阅读 · 2018年7月30日

Nonparametric Topic Modeling with Neural Inference

Arxiv

3+阅读 · 2018年6月18日

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

Arxiv

3+阅读 · 2018年4月10日

Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

Arxiv

5+阅读 · 2018年3月23日

微信扫码咨询专知VIP会员