Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to training. We introduce Progressive Compressed Records (PCRs), a data format that uses compression to reduce the overhead of fetching and transporting data, effectively reducing the training time required to achieve a target accuracy. PCRs deviate from previous storage formats by combining progressive compression with an efficient storage layout to view a single dataset at multiple fidelities---all without adding to the total dataset size. We implement PCRs and evaluate them on a range of datasets, training tasks, and hardware architectures. Our work shows that: (i) the amount of compression a dataset can tolerate exceeds 50% of the original encoding for many DL training tasks; (ii) it is possible to automatically and efficiently select appropriate compression levels for a given task; and (iii) PCRs enable tasks to readily access compressed data at runtime---utilizing as little as half the training bandwidth and thus potentially doubling training speed.
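To make the underlying mechanism concrete, the sketch below shows the kind of progressive compression PCRs build on: a progressive JPEG is a sequence of scans of increasing fidelity, so decoding only a prefix of the byte stream yields a lower-fidelity version of the same image. This is a minimal illustration, not the paper's implementation; the Pillow dependency, the function names, and the arbitrary truncation fraction are assumptions for illustration. The abstract's storage layout goes further, arranging these progressive levels so that an entire dataset can be read at a chosen fidelity without storing it more than once.

```python
from io import BytesIO

from PIL import Image, ImageFile

# Allow Pillow to decode byte streams whose trailing scans are missing;
# a real format would track scan boundaries instead of truncating blindly.
ImageFile.LOAD_TRUNCATED_IMAGES = True


def encode_progressive(image_path: str) -> bytes:
    """Encode an image as a progressive JPEG (a sequence of refinement scans)."""
    buf = BytesIO()
    Image.open(image_path).save(buf, format="JPEG", progressive=True, quality=90)
    return buf.getvalue()


def decode_prefix(jpeg_bytes: bytes, fraction: float) -> Image.Image:
    """Decode only the first `fraction` of the stream, yielding a lower-fidelity image."""
    prefix = jpeg_bytes[: int(len(jpeg_bytes) * fraction)]
    img = Image.open(BytesIO(prefix))
    img.load()  # decodes whatever scans are present in the prefix
    return img


# Example: reading half the bytes still produces a usable, lower-fidelity image,
# which is the bandwidth/fidelity trade-off the abstract describes.
# full = encode_progressive("example.jpg")
# low_fidelity = decode_prefix(full, fraction=0.5)
```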