Clio: 硬件-软软件共同指定拆分内存系统 (Clio: A Hardware-Software Co-Designed Disaggregated Memory System) - 专知论文

会员服务 ·

0

Processing（编程语言） · 结点 · Networking · Better · prototype ·

2021 年 8 月 15 日

Clio: A Hardware-Software Co-Designed Disaggregated Memory System

翻译：Clio: 硬件-软软件共同指定拆分内存系统

Zhiyuan Guo,Yizhou Shan,Xuhao Luo,Yutong Huang,Yiying Zhang

from arxiv, 17 pages, 19 figures

Memory disaggregation has attracted great attention recently because of its benefits in efficient memory utilization and ease of management. So far, memory disaggregation research has all taken one of two approaches, building/emulating memory nodes with either regular servers or raw memory devices with no processing power. The former incurs higher monetary cost and face tail latency and scalability limitations, while the latter introduce performance, security, and management problems. Server-based memory nodes and memory nodes with no processing power are two extreme approaches. We seek a sweet spot in the middle by proposing a hardware-based memory disaggregation solution that has the right amount of processing power at memory nodes. Furthermore, we take a clean-slate approach by starting from the requirements of memory disaggregation and designing a memory-disaggregation-native system. We propose a hardware-based disaggregated memory system, Clio, that virtualizes and manages disaggregated memory at the memory node. Clio includes a new hardware-based virtual memory system, a customized network system, and a framework for computation offloading. In building Clio, we not only co-design OS functionalities, hardware architecture, and the network system, but also co-design the compute node and memory node. We prototyped Clio's memory node with FPGA and implemented its client-node functionalities in a user-space library. Clio achieves 100 Gbps throughput and an end-to-end latency of 2.5 us at median and 3.2 us at the 99th percentile. Clio scales much better and has orders of magnitude lower tail latency than RDMA, and it has 1.1x to 3.4x energy saving compared to CPU-based and SmartNIC-based disaggregated memory systems and is 2.7x faster than software-based SmartNIC solutions.

翻译：内存分解最近引起极大关注, 因为它对有效记忆利用和易于管理的好处。到目前为止, 内存分解研究已经采取了两种方法之一, 即用正常服务器或没有处理电的原始内存装置来建立/ 模拟内存节点。前一种产生更高的货币成本, 并面临尾部延缩和缩缩缩限制, 而后一种则引入了性能、安全和管理问题。基于服务器的内存节点和内存节点是两种极端的办法。我们通过提出基于硬件的内存分解办法, 寻求中间的甜蜜点。我们寻求一种基于硬件的内存分解办法, 在内存节点上处理的电量是正确的。此外, 我们采用清洁式的内存节点方法, 从内存分解要求开始, 设计一个内存分解的内存系统, Clio, 在内存节点中, C- 内存流流流流流流系统比 C- 内存系统要快得多, 网络内存系统比 C- 内存和内存系统要多。

0

相关内容

Processing（编程语言）

Processing（编程语言）

Processing 是一门开源编程语言和与之配套的集成开发环境（IDE）的名称。Processing 在电子艺术和视觉设计社区被用来教授编程基础，并运用于大量的新媒体和互动艺术作品中。

图计算加速架构综述

图计算加速架构综述

专知会员服务

51+阅读 · 2021年4月5日

【AAAI2021】数据增强图神经网络

专知会员服务

108+阅读 · 2020年12月21日

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

专知会员服务

53+阅读 · 2020年8月25日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

15+阅读 · 2020年3月7日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

专知会员服务

60+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Federated Learning: 架构

Federated Learning: 架构

AINLP

4+阅读 · 2020年9月20日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | COLT 2019等国际会议信息9条

人工智能 | COLT 2019等国际会议信息9条

Call4Papers

6+阅读 · 2018年9月21日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

多轮对话之对话管理：Dialog Management

多轮对话之对话管理：Dialog Management

PaperWeekly

18+阅读 · 2018年1月15日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

Privacy-Preserving Phishing Email Detection Based on Federated Learning and LSTM

Privacy-Preserving Phishing Email Detection Based on Federated Learning and LSTM

Arxiv

0+阅读 · 2021年10月12日

Energy-cost aware off-grid base stations with IoT devices for developing a green heterogeneous network

Arxiv

0+阅读 · 2021年10月12日

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training

Arxiv

0+阅读 · 2021年10月11日

Socioeconomic Impact of Emerging Mobility Markets and Implementation Strategies

Arxiv

0+阅读 · 2021年10月10日

From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness

Arxiv

0+阅读 · 2021年10月7日

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Arxiv

5+阅读 · 2021年3月24日

Imbalance Problems in Object Detection: A Review

Arxiv

24+阅读 · 2020年3月11日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

19+阅读 · 2019年11月20日

MARS: Memory Attention-Aware Recommender System

Arxiv

6+阅读 · 2018年5月18日

Finding ReMO (Related Memory Object): A Simple Neural Architecture for Text based Reasoning

Arxiv

4+阅读 · 2018年1月26日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

图计算加速架构综述

图计算加速架构综述

专知会员服务

51+阅读 · 2021年4月5日

【AAAI2021】数据增强图神经网络

专知会员服务

108+阅读 · 2020年12月21日

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

专知会员服务

53+阅读 · 2020年8月25日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

15+阅读 · 2020年3月7日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

专知会员服务

60+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

Federated Learning: 架构

Federated Learning: 架构

AINLP

4+阅读 · 2020年9月20日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | COLT 2019等国际会议信息9条

人工智能 | COLT 2019等国际会议信息9条

Call4Papers

6+阅读 · 2018年9月21日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

多轮对话之对话管理：Dialog Management

多轮对话之对话管理：Dialog Management

PaperWeekly

18+阅读 · 2018年1月15日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

相关论文

Privacy-Preserving Phishing Email Detection Based on Federated Learning and LSTM

Privacy-Preserving Phishing Email Detection Based on Federated Learning and LSTM

Arxiv

0+阅读 · 2021年10月12日

Energy-cost aware off-grid base stations with IoT devices for developing a green heterogeneous network

Arxiv

0+阅读 · 2021年10月12日

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training

Arxiv

0+阅读 · 2021年10月11日

Socioeconomic Impact of Emerging Mobility Markets and Implementation Strategies

Arxiv

0+阅读 · 2021年10月10日

From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness

Arxiv

0+阅读 · 2021年10月7日

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Arxiv

5+阅读 · 2021年3月24日

Imbalance Problems in Object Detection: A Review

Arxiv

24+阅读 · 2020年3月11日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

19+阅读 · 2019年11月20日

MARS: Memory Attention-Aware Recommender System

Arxiv

6+阅读 · 2018年5月18日

Finding ReMO (Related Memory Object): A Simple Neural Architecture for Text based Reasoning

Arxiv

4+阅读 · 2018年1月26日

微信扫码咨询专知VIP会员