Understanding Robustness of Transformers for Image Classification

Robustness · Image classification · Transformer · ResNet · Interpretability

October 8, 2021

Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit

From arXiv. Accepted for publication at ICCV 2021. Rewrote Section 5 and made other minor changes throughout.

Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification. However, details of the Transformer architecture -- such as the use of non-overlapping patches -- lead one to wonder whether these networks are as robust. In this paper, we perform an extensive study of a variety of different measures of robustness of ViT models and compare the findings to ResNet baselines. We investigate robustness to input perturbations as well as robustness to model perturbations. We find that when pre-trained with a sufficient amount of data, ViT models are at least as robust as the ResNet counterparts on a broad range of perturbations. We also find that Transformers are robust to the removal of almost any single layer, and that while activations from later layers are highly correlated with each other, they nevertheless play an important role in classification.
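The two model-perturbation ideas mentioned in the abstract, removing a single layer and comparing activations across layers, can be illustrated with a small toy. The sketch below is not the paper's code or experimental setup: it assumes a randomly initialized stack of residual blocks as a stand-in for a trained ViT, and all names (`make_blocks`, `forward`, `layer_removal_agreement`, `activation_correlation`) are hypothetical. It only shows the shape of the procedure: replace one block with the identity, check how often the predicted class stays the same, and compute the correlation between per-block activations.

```python
import numpy as np

def make_blocks(num_blocks, dim, rng, scale=0.1):
    """Random weights for a toy residual stack (assumed stand-in for a trained model)."""
    return [rng.normal(0.0, scale, size=(dim, dim)) for _ in range(num_blocks)]

def forward(x, blocks, head, skip=None):
    """Run x through the residual blocks; block `skip` (if given) acts as the identity.

    Returns (logits, activations), where activations[i] is the output after block i.
    """
    activations = []
    for i, w in enumerate(blocks):
        if i != skip:
            x = x + np.tanh(x @ w)  # residual update: x_{l+1} = x_l + f_l(x_l)
        activations.append(x)
    return x @ head, activations

def layer_removal_agreement(x, blocks, head):
    """For each block, the fraction of inputs whose predicted class is unchanged
    when that single block is removed."""
    base_pred = forward(x, blocks, head)[0].argmax(axis=1)
    agreement = []
    for i in range(len(blocks)):
        pred = forward(x, blocks, head, skip=i)[0].argmax(axis=1)
        agreement.append(float(np.mean(pred == base_pred)))
    return agreement

def activation_correlation(activations):
    """Pearson correlation matrix between flattened per-block activations."""
    flat = np.stack([a.ravel() for a in activations])
    return np.corrcoef(flat)

# Toy data: 256 random "inputs" of dimension 32, 8 blocks, 10 classes (all assumed).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 32))
blocks = make_blocks(8, 32, rng)
head = rng.normal(size=(32, 10))

agreement = layer_removal_agreement(x, blocks, head)
corr = activation_correlation(forward(x, blocks, head)[1])
```

With random weights the numbers carry no meaning about ViT or ResNet; on a trained model, the same two measurements would be taken on held-out images, with accuracy replacing prediction agreement.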

