Exoshuffle: Large-Scale Shuffle at the Application Level


January 20, 2023


Frank Sifei Luan, Stephanie Wang, Samyukta Yagati, Sean Kim, Kenneth Lien, Isaac Ong, Tony Hong, SangBin Cho, Eric Liang, Ion Stoica

Shuffle is a key primitive in large-scale data processing applications that has inspired a myriad of implementations. While previous work has produced breakthroughs in shuffle performance, many applications do not benefit in practice because of the difficulty of evolving existing shuffle systems. Shuffle is often tightly integrated into a framework that offers a higher-level abstraction such as SQL. Integrating new shuffle designs into these frameworks requires significant development effort. Furthermore, distributed shuffle is used by many different end use cases, from high-throughput batch processing to low-latency online aggregation. These different use cases have driven the creation of new application frameworks, each of which must rebuild shuffle from scratch. We enable shuffle flexibility by building distributed shuffle as a library. We use distributed futures as an intermediate layer for building distributed shuffle as a library and show how it enables the shuffle control plane to be decoupled from a common high-performance data plane based on Ray. We present Exoshuffle and show that we can: (1) rewrite previous shuffle optimizations as application-level libraries with an order of magnitude less code, (2) build a shuffle-agnostic data plane that provides performance and scalability competitive with specialized shuffle systems, and (3) enable new applications such as ML training to easily leverage large-scale distributed shuffle.
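To make the idea of shuffle as an application-level library over futures more concrete, here is a minimal sketch. It is not the Exoshuffle implementation and does not use Ray: as an assumption for illustration, Python's standard `concurrent.futures` thread pool stands in for a distributed futures system, and the names `map_task`, `reduce_task`, and `simple_shuffle` are hypothetical. The control logic (which tasks run and which futures they consume) lives entirely in library code, while the executor plays the role of the data plane.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List


def map_task(block: List[int], num_reducers: int) -> List[List[int]]:
    """Partition one input block into num_reducers buckets (key modulo)."""
    buckets: List[List[int]] = [[] for _ in range(num_reducers)]
    for x in block:
        buckets[x % num_reducers].append(x)
    return buckets


def reduce_task(map_futures: List[Future], r: int) -> List[int]:
    """Collect bucket r from every map output and return it sorted."""
    merged: List[int] = []
    for f in map_futures:
        merged.extend(f.result()[r])  # blocks until that map task is done
    return sorted(merged)


def simple_shuffle(blocks: List[List[int]], num_reducers: int,
                   pool: ThreadPoolExecutor) -> List[List[int]]:
    """Hypothetical library-level shuffle: all maps first, then all reduces.

    Map tasks are submitted before reduce tasks, so the FIFO work queue
    always starts every map before any reduce can wait on it.
    """
    map_futures = [pool.submit(map_task, b, num_reducers) for b in blocks]
    reduce_futures = [pool.submit(reduce_task, map_futures, r)
                      for r in range(num_reducers)]
    return [f.result() for f in reduce_futures]


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(simple_shuffle([[5, 2, 9], [4, 7, 0], [3, 8, 1, 6]], 3, pool))
```

Because the shuffle strategy is ordinary library code, a variant (for example, merging map outputs before the reduce stage) would in this sketch be a different function over the same executor rather than a change to the underlying system.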

