Neural-based Modeling for Performance Tuning of Spark Data Analytics - 专知 Papers


Performer · tuning · MoDELS · Spark · Extensibility

January 20, 2021

Neural-based Modeling for Performance Tuning of Spark Data Analytics


Khaled Zaouk, Fei Song, Chenghao Lyu, Yanlei Diao

Cloud data analytics has become an integral part of enterprise business operations for data-driven insight discovery. Performance modeling of cloud data analytics is crucial for performance tuning and other critical operations in the cloud. Traditional modeling techniques fail to adapt to the high degree of diversity in workloads and system behaviors in this domain. In this paper, we bring recent Deep Learning techniques to bear on the process of automated performance modeling of cloud data analytics, with a focus on Spark data analytics as representative workloads. At the core of our work is the notion of learning workload embeddings (with a set of desired properties) to represent fundamental computational characteristics of different jobs, which enable performance prediction when used together with job configurations that control resource allocation and other system knobs. Our work provides an in-depth study of different modeling choices that suit our requirements. Results of extensive experiments reveal the strengths and limitations of different modeling methods, as well as superior performance of our best performing method over a state-of-the-art modeling tool for cloud analytics.
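The central idea of the abstract, a workload embedding used together with a job configuration to predict performance, can be illustrated with a minimal sketch. This is not the authors' model: the function name `predict_performance`, the parameter names, the vector sizes, and the one-hidden-layer network are all assumptions chosen for illustration. It only shows how an embedding vector and a vector of configuration knobs might be concatenated and passed to a small regressor.

```python
import numpy as np

def predict_performance(workload_embedding, job_config, params):
    """Hypothetical predictor: embedding + configuration -> scalar metric.

    workload_embedding: 1-D array summarizing a job's computational traits
    job_config:         1-D array of knob values (e.g. cores, memory), assumed
                        to be already numeric and normalized
    params:             dict with W1, b1 (hidden layer) and w2, b2 (output)
    """
    # Join the learned workload representation with the knob settings.
    features = np.concatenate([workload_embedding, job_config])
    # One hidden layer with ReLU; an illustrative choice, not the paper's.
    hidden = np.maximum(0.0, params["W1"] @ features + params["b1"])
    # Linear output, e.g. a predicted latency.
    return float(params["w2"] @ hidden + params["b2"])

# Toy parameters so the arithmetic can be checked by hand.
toy_params = {
    "W1": np.ones((2, 5)),
    "b1": np.zeros(2),
    "w2": np.ones(2),
    "b2": 1.0,
}
toy_embedding = np.array([1.0, 1.0])      # assumed 2-dimensional embedding
toy_config = np.array([1.0, 1.0, 1.0])    # assumed 3 configuration knobs
toy_prediction = predict_performance(toy_embedding, toy_config, toy_params)
```

In this sketch the same embedding can be reused with different `job_config` vectors, which is what would let a tuner compare candidate configurations for one workload without rerunning the job.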


