Data augmentation has emerged as a powerful technique for improving the performance of deep neural networks and has led to state-of-the-art results in computer vision. However, state-of-the-art data augmentation strongly distorts training images, leading to a disparity between the examples seen during training and those seen at inference. In this work, we explore a recently proposed training paradigm to correct for this disparity: using an auxiliary BatchNorm for the potentially out-of-distribution, strongly augmented images. Our experiments then focus on how to define the BatchNorm parameters used at evaluation. To eliminate the train-test disparity, we experiment with using batch statistics defined by clean training images only, yet surprisingly find that this does not improve model performance. Instead, we investigate using BatchNorm parameters defined by weak augmentations and find that this method significantly improves performance on common image classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet. We then explore a fundamental trade-off between accuracy and robustness that arises from using different BatchNorm parameters, providing greater insight into the benefits of data augmentation for model performance.
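The auxiliary-BatchNorm scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it keeps two independent sets of running statistics, one for clean or weakly augmented batches and one for strongly augmented batches, with shared affine parameters, and lets the caller pick which statistics to normalize with at evaluation time. All class and method names (`AuxiliaryBatchNorm`, `forward_train`, `forward_eval`, the `"main"`/`"aux"` branch labels) are illustrative assumptions.

```python
import numpy as np

class AuxiliaryBatchNorm:
    """Sketch of an auxiliary BatchNorm: separate running statistics for
    clean/weakly augmented batches ('main') and strongly augmented
    batches ('aux'); affine parameters (gamma, beta) are shared."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        # Two independent sets of running statistics.
        self.stats = {
            "main": {"mean": np.zeros(num_features), "var": np.ones(num_features)},
            "aux":  {"mean": np.zeros(num_features), "var": np.ones(num_features)},
        }
        # Shared affine parameters.
        self.gamma = np.ones(num_features)
        self.beta = np.zeros(num_features)

    def forward_train(self, x, branch):
        """Normalize a training batch with its own batch statistics and
        update the running statistics of the chosen branch only."""
        mean, var = x.mean(axis=0), x.var(axis=0)
        s = self.stats[branch]
        s["mean"] = (1 - self.momentum) * s["mean"] + self.momentum * mean
        s["var"] = (1 - self.momentum) * s["var"] + self.momentum * var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

    def forward_eval(self, x, branch="main"):
        """Normalize with the running statistics of the chosen branch.
        Choosing 'main' vs 'aux' here is the evaluation-time decision
        the experiments above compare."""
        s = self.stats[branch]
        return self.gamma * (x - s["mean"]) / np.sqrt(s["var"] + self.eps) + self.beta
```

During training, clean or weakly augmented batches would be routed through the `"main"` branch and strongly augmented batches through the `"aux"` branch; at test time a single branch's statistics normalize all inputs, which is where the accuracy-robustness trade-off discussed above shows up.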