We revisit non-blocking simultaneous multithreading (NB-SMT), introduced previously by Shomron and Weiser (2020). NB-SMT trades accuracy for performance by occasionally "squeezing" more than one thread into a shared multiply-and-accumulate (MAC) unit. However, accommodating more than one thread in a shared MAC unit may introduce noise into the computations, thereby changing the internal statistics of the model. We show that a substantial portion of the lost model accuracy can be recouped by post-training recalibration of the batch normalization layers' running mean and running variance statistics in the presence of NB-SMT.
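As a minimal sketch of the kind of post-training recalibration described above, the following PyTorch snippet re-estimates the BatchNorm running statistics by forwarding a small calibration set through a model whose forward pass is assumed to already emulate NB-SMT-induced noise. The function name `recalibrate_bn`, the `calib_loader` argument, and the batch budget are illustrative choices, not artifacts of the original work.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def recalibrate_bn(model: nn.Module, calib_loader, num_batches: int = 100) -> nn.Module:
    """Re-estimate BatchNorm running mean/variance under NB-SMT-induced noise.

    Assumes `model`'s forward pass already reflects NB-SMT behavior
    (simulated or in hardware), so the recalibrated statistics match the
    noisy activation distributions seen at inference time.
    """
    model.eval()  # keep dropout etc. in inference mode

    # Reset running statistics and put only the BN layers in training mode,
    # so their running mean/variance are re-accumulated from calibration data.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None  # None => cumulative moving average over all batches
            m.train()

    for i, (images, _) in enumerate(calib_loader):
        if i >= num_batches:
            break
        model(images)  # forward pass only; weights are never updated

    model.eval()
    return model
```

Only the BN buffers change here; the learned weights, biases, and BN affine parameters are left untouched, which is what makes this a post-training (no fine-tuning) procedure.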