Recent deep learning models are difficult to train with a large batch size because commodity machines may not have enough memory to hold both the model and a large data batch. The batch size is one of the hyper-parameters of the training process, and it is limited by the target machine's memory capacity: the batch can only occupy the memory that remains after the model is loaded. The size of each data item is also an important factor, since larger data items further reduce the batch size that fits in the remaining memory. This paper proposes a framework called Micro-Batch Streaming (MBS) to address this problem. MBS splits a batch into micro-batches small enough to fit in the remaining memory and streams them to the device sequentially, while a loss normalization algorithm based on gradient accumulation preserves training performance. The goal of our method is to let deep learning models train with batch sizes that exceed the memory capacity of a single system, without increasing memory size or using multiple devices (GPUs).
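To illustrate the general idea behind streaming micro-batches with loss normalization, the following is a minimal sketch using plain gradient accumulation in PyTorch. It is not the authors' MBS framework; the sizes (GLOBAL_BATCH, MICRO_BATCH), the toy model, and the data are hypothetical stand-ins chosen only to show how dividing the loss by the number of accumulation steps makes the accumulated gradient match what one large batch would have produced.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical sizes for illustration only.
GLOBAL_BATCH = 256        # desired (large) batch size
MICRO_BATCH = 32          # size that actually fits in device memory
ACCUM_STEPS = GLOBAL_BATCH // MICRO_BATCH

device = "cuda" if torch.cuda.is_available() else "cpu"

# A toy model and synthetic data; stand-ins for any real workload.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(1024, 128)
targets = torch.randint(0, 10, (1024,))
# The loader yields micro-batches; ACCUM_STEPS of them emulate one large batch.
loader = DataLoader(TensorDataset(inputs, targets), batch_size=MICRO_BATCH)

optimizer.zero_grad()
for step, (x, y) in enumerate(loader, start=1):
    x, y = x.to(device), y.to(device)
    loss = criterion(model(x), y)
    # Normalize the loss so that the gradients accumulated over ACCUM_STEPS
    # micro-batches equal the gradient of one batch of size GLOBAL_BATCH.
    (loss / ACCUM_STEPS).backward()
    if step % ACCUM_STEPS == 0:
        # One optimizer update per emulated large batch.
        optimizer.step()
        optimizer.zero_grad()
```

In this sketch only micro-batch-sized activations are ever resident on the device at once, which is what allows the emulated batch size to exceed the memory that remains after the model is loaded.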