" 低精度培训中的错误交易记忆 " : " 低精度培训中的错误交易记忆 " 。 (How Low Can We Go: Trading Memory for Error in Low-Precision Training) - 专知论文

会员服务 ·

0

查准率/准确率 · Less · 舍入误差 · 估计/估计量 · tuning ·

2021 年 6 月 18 日

How Low Can We Go: Trading Memory for Error in Low-Precision Training

翻译：" 低精度培训中的错误交易记忆 " : " 低精度培训中的错误交易记忆 " 。

Chengrun Yang,Ziyang Wu,Jerry Chee,Christopher De Sa,Madeleine Udell

Low-precision arithmetic trains deep learning models using less energy, less memory and less time. However, we pay a price for the savings: lower precision may yield larger round-off error and hence larger prediction error. As applications proliferate, users must choose which precision to use to train a new model, and chip manufacturers must decide which precisions to manufacture. We view these precision choices as a hyperparameter tuning problem, and borrow ideas from meta-learning to learn the tradeoff between memory and error. In this paper, we introduce Pareto Estimation to Pick the Perfect Precision (PEPPP). We use matrix factorization to find non-dominated configurations (the Pareto frontier) with a limited number of network evaluations. For any given memory budget, the precision that minimizes error is a point on this frontier. Practitioners can use the frontier to trade memory for error and choose the best precision for their goals.

翻译：低精度算术用较少的能量、记忆和时间来培养深层次学习模型。但是,我们为节省开支付出了代价: 低精度可能导致更大的回合错误, 从而导致更大的预测错误。随着应用的激增, 用户必须选择用来训练新模型的精确度, 芯片制造商必须决定制造的精确度。我们认为这些精确度选择是一个超参数调问题, 从元学习中借用想法来学习记忆和错误的权衡。在本文中, 我们引入了Pareto Estimation来选择完美精度(PPPPPPP ) 。我们使用矩阵化来寻找非主控配置( Pareto边框 ), 以及数量有限的网络评价。对于任何特定的记忆预算, 尽量减少错误的精确度是这一边框的一个点。执行者可以使用边框来交换记忆错误, 并选择其目标的最佳精确度。

0

相关内容

查准率/准确率

查准率/准确率

【硬核书】量子信息理论，598页pdf

专知会员服务

127+阅读 · 2021年8月4日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【伯克利-Ke Li】学习优化，74页ppt，Learning to Optimize

【伯克利-Ke Li】学习优化，74页ppt，Learning to Optimize

专知会员服务

41+阅读 · 2020年7月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Google 76分钟训练万BERT最新论文】Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

【Google 76分钟训练万BERT最新论文】Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

专知会员服务

4+阅读 · 2020年1月7日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

已删除

将门创投

5+阅读 · 2017年10月20日

A Recursive Logit Model with Choice Aversion and Its Application to Transportation Networks

A Recursive Logit Model with Choice Aversion and Its Application to Transportation Networks

Arxiv

0+阅读 · 2021年8月18日

ResMLP: Feedforward networks for image classification with data-efficient training

Arxiv

12+阅读 · 2021年5月7日

A Baseline for Few-Shot Image Classification

Arxiv

7+阅读 · 2020年3月1日

Reducing Parameter Space for Neural Network Training

Arxiv

3+阅读 · 2018年8月17日

BlockDrop: Dynamic Inference Paths in Residual Networks

Arxiv

6+阅读 · 2018年3月30日

A Framework for Evaluating 6-DOF Object Trackers

Arxiv

6+阅读 · 2018年3月28日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

Good Features to Correlate for Visual Tracking

Arxiv

10+阅读 · 2018年3月10日

Depth-Adaptive Computational Policies for Efficient Visual Tracking

Arxiv

8+阅读 · 2018年1月1日

Long-Term Visual Object Tracking Benchmark

Arxiv

7+阅读 · 2017年12月28日

VIP会员

文章信息

相关主题

查准率/准确率

估计/估计量

相关VIP内容

【硬核书】量子信息理论，598页pdf

专知会员服务

127+阅读 · 2021年8月4日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【伯克利-Ke Li】学习优化，74页ppt，Learning to Optimize

【伯克利-Ke Li】学习优化，74页ppt，Learning to Optimize

专知会员服务

41+阅读 · 2020年7月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Google 76分钟训练万BERT最新论文】Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

【Google 76分钟训练万BERT最新论文】Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

专知会员服务

4+阅读 · 2020年1月7日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《美国海军陆战队软件定义网络应用案例：分布式防火墙自动化系统》148页

《多体环境下定位导航授时（PNT）系统研究》228页

软件定义无线电（SDR）：商业与军事领域的技术、应用及未来趋势

《攻势防空作战中无人追击者/规避者最优轨迹研究（含动态交战区建模）》95页

相关资讯

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

已删除

将门创投

5+阅读 · 2017年10月20日

相关论文

A Recursive Logit Model with Choice Aversion and Its Application to Transportation Networks

A Recursive Logit Model with Choice Aversion and Its Application to Transportation Networks

Arxiv

0+阅读 · 2021年8月18日

ResMLP: Feedforward networks for image classification with data-efficient training

Arxiv

12+阅读 · 2021年5月7日

A Baseline for Few-Shot Image Classification

Arxiv

7+阅读 · 2020年3月1日

Reducing Parameter Space for Neural Network Training

Arxiv

3+阅读 · 2018年8月17日

BlockDrop: Dynamic Inference Paths in Residual Networks

Arxiv

6+阅读 · 2018年3月30日

A Framework for Evaluating 6-DOF Object Trackers

Arxiv

6+阅读 · 2018年3月28日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

Good Features to Correlate for Visual Tracking

Arxiv

10+阅读 · 2018年3月10日

Depth-Adaptive Computational Policies for Efficient Visual Tracking

Arxiv

8+阅读 · 2018年1月1日

Long-Term Visual Object Tracking Benchmark

Arxiv

7+阅读 · 2017年12月28日

微信扫码咨询专知VIP会员