We explore the design of scalable synchronization primitives for disaggregated shared memory. Porting existing synchronization primitives to disaggregated shared memory yields poor scalability with the number of application threads, because these primitives are layered atop cache-coherence substrates, which generates redundant inter-core communication. The substantially higher cache-coherence latency ($\mu$s-scale) and substantially lower bandwidth of state-of-the-art disaggregated shared memory designs amplify the impact of this redundant communication and preclude scalability. In this work, we argue for co-designing the cache-coherence and synchronization layers to achieve better performance scaling of multi-threaded applications on disaggregated memory. This is driven by our observation that synchronization primitives are essentially a generalization of cache-coherence protocols in time and space. We present GCS as an implementation of this co-design. GCS employs wait queues and arbitrarily sized cache lines directly at the cache-coherence protocol layer for temporal and spatial generalization, respectively. We evaluate GCS against the layered approach to synchronization primitives, represented by the pthread implementation of the reader-writer lock, and show that GCS improves in-memory key-value store performance at scale by one to two orders of magnitude.