How does SSD Cluster Perform for Distributed File Systems: An Empirical Study

March 22, 2023

Jiashu Wu, Yang Wang, Jinpeng Wang, Hekang Wang, Taorui Lin

From arXiv; accepted by Concurrency and Computation: Practice and Experience.

As the capacity of Solid-State Drives (SSDs) keeps growing while their cost gradually falls, the SSD cluster is now widely deployed as part of hybrid storage systems in scenarios such as cloud computing and big data processing. However, despite this rapid development, the performance of the SSD cluster remains largely under-investigated, which leads to sub-optimal deployments in practice. To address this issue, in this paper we conduct extensive empirical studies for a comprehensive understanding of the SSD cluster in diverse settings. To this end, we configure a real SSD cluster and gather the trace data generated by several commonly used benchmarks, then adopt analytical methods to analyse the performance of the SSD cluster under different configurations. In particular, regression models are built to provide better performance predictability under broader configurations, and the correlations between influential factors and performance metrics are investigated for different numbers of nodes, which reveals the high scalability of the SSD cluster. Additionally, the cluster's network bandwidth is inspected to explain the performance bottleneck. Finally, the knowledge gained is summarised to benefit SSD cluster deployment in practice.
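
The abstract does not say which regression form or which variables the authors use, so the following is only a minimal sketch of the general idea: fit a simple least-squares line of a performance metric against the number of nodes, and compute the Pearson correlation between the two. The function names and every number in `nodes` and `throughput` are hypothetical placeholders chosen for illustration; they are not measurements from the paper.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder data: node counts and an invented throughput figure
# (arbitrary units) for each. These are NOT results from the paper.
nodes = [1, 2, 3, 4, 5]
throughput = [110, 205, 310, 395, 505]

slope, intercept = fit_line(nodes, throughput)
r = pearson(nodes, throughput)

# Extrapolate to a configuration that was not in the placeholder sample.
predicted_at_8_nodes = slope * 8 + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.4f}")
print(f"predicted throughput at 8 nodes: {predicted_at_8_nodes:.1f}")
```

A correlation close to 1 between node count and throughput in data of this shape would be one way to read "high scalability", though a real analysis would likely involve more factors (for example block size or replication settings) and a check on how far such a model can safely be extrapolated.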
