Most AI projects start with a Python notebook running on a single laptop; however, scaling them up to handle larger datasets (for both experimentation and production deployment) usually involves a mountain of pain. Data scientists must go through many manual and error-prone steps to fully take advantage of the available hardware resources (e.g., SIMD instructions, multi-processing, quantization, memory allocation optimization, data partitioning, distributed computing, etc.). To address this challenge, we have open sourced BigDL 2.0 at https://github.com/intel-analytics/BigDL/ under the Apache 2.0 license (combining the original BigDL and Analytics Zoo projects). Using BigDL 2.0, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then be transparently accelerated on a single node (with up to 9.6x speedup in our experiments) and seamlessly scaled out to a large cluster (across several hundred servers in real-world use cases). BigDL 2.0 has already been adopted by many real-world users (such as Mastercard, Burger King, Inspur, etc.) in production.
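To make the "laptop notebook to cluster" claim concrete, the following is a minimal sketch of the intended workflow, assuming the bigdl-orca package and its PyTorch Estimator API; the toy model, data loader, and resource settings are hypothetical placeholders, and exact argument names may differ across BigDL releases.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from bigdl.orca import init_orca_context, stop_orca_context
from bigdl.orca.learn.pytorch import Estimator

# Per the abstract's claim, moving from a laptop to a cluster is mainly a matter
# of changing cluster_mode (e.g., "local" -> "yarn-client"); the notebook code
# below stays the same. Resource values here are illustrative only.
init_orca_context(cluster_mode="local", cores=4, memory="8g")

# An ordinary single-node PyTorch model and data pipeline (toy example).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
data = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
train_loader = DataLoader(data, batch_size=32, shuffle=True)

# Wrap the model in an Orca Estimator; training is then distributed
# transparently over whatever resources the Orca context provides.
est = Estimator.from_torch(model=model, optimizer=optimizer, loss=nn.MSELoss())
est.fit(data=train_loader, epochs=2)

stop_orca_context()
```

The design point this sketch illustrates is that the data scientist writes standard PyTorch code, while hardware- and cluster-specific concerns (process placement, data partitioning, etc.) are handled by the runtime behind the Estimator abstraction.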