We present Mobile-Former, a parallel design of MobileNet and transformer with a two-way bridge in between. This structure leverages the advantages of MobileNet for local processing and of the transformer for global interaction, while the bridge enables bidirectional fusion of local and global features. Different from recent works on vision transformers, the transformer in Mobile-Former contains very few tokens (e.g., six or fewer) that are randomly initialized to learn global priors, resulting in low computational cost. Combined with the proposed lightweight cross attention that models the bridge, Mobile-Former is not only computationally efficient but also has strong representation power. It outperforms MobileNetV3 in the low-FLOP regime, from 25M to 500M FLOPs, on ImageNet classification. For instance, Mobile-Former achieves 77.9\% top-1 accuracy at 294M FLOPs, gaining 1.3\% over MobileNetV3 while saving 17\% of computation. When transferred to object detection, Mobile-Former outperforms MobileNetV3 by 8.6 AP in the RetinaNet framework. Furthermore, we build an efficient end-to-end detector by replacing the backbone, encoder, and decoder in DETR with Mobile-Former, which outperforms DETR by 1.1 AP while saving 52\% of computational cost and 36\% of parameters.
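To make the architecture concrete, below is a minimal sketch of one Mobile-Former block in PyTorch. This is an illustration under our reading of the abstract, not the authors' implementation: it omits details such as dynamic ReLU and the exact projection layout of the paper's lightweight cross attention, and all names (MobileFormerBlock, proj_in, proj_out, cross_attn) are illustrative.

```python
import torch
import torch.nn as nn


class MobileFormerBlock(nn.Module):
    """Sketch of one parallel block: a MobileNet-style inverted
    bottleneck (local) runs alongside a tiny transformer over a few
    learnable global tokens, coupled by a two-way cross-attention bridge."""

    def __init__(self, channels: int, token_dim: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion
        # Mobile side: pointwise expand -> depthwise 3x3 -> pointwise project.
        self.mobile = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Former side: standard self-attention, but only over the few tokens.
        self.former = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=2, dim_feedforward=2 * token_dim,
            batch_first=True)
        # Bridge projections between feature channels and token dimension.
        self.proj_in = nn.Linear(channels, token_dim)   # Mobile -> Former
        self.proj_out = nn.Linear(token_dim, channels)  # Former -> Mobile

    @staticmethod
    def cross_attn(q: torch.Tensor, kv: torch.Tensor) -> torch.Tensor:
        # Simplified single-head cross attention: scaled dot-product with
        # keys and values shared (no separate K/V projections).
        attn = torch.softmax(
            q @ kv.transpose(-2, -1) / kv.shape[-1] ** 0.5, dim=-1)
        return attn @ kv

    def forward(self, x: torch.Tensor, tokens: torch.Tensor):
        b, c, h, w = x.shape
        feat = x.flatten(2).transpose(1, 2)              # (B, HW, C)
        # Mobile -> Former: tokens gather global context from local features.
        tokens = tokens + self.cross_attn(tokens, self.proj_in(feat))
        tokens = self.former(tokens)
        # Local processing on the feature map (residual).
        x = x + self.mobile(x)
        # Former -> Mobile: pixels attend back to the global tokens.
        feat = x.flatten(2).transpose(1, 2)
        feat = feat + self.cross_attn(feat, self.proj_out(tokens))
        return feat.transpose(1, 2).reshape(b, c, h, w), tokens
```

A hypothetical usage, with both streams updated and passed on to the next block:

```python
block = MobileFormerBlock(channels=32, token_dim=128)
x = torch.randn(2, 32, 56, 56)    # local feature map
tokens = torch.randn(2, 6, 128)   # six randomly initialized global tokens
x, tokens = block(x, tokens)
```

Even in this sketch, the cost structure behind the abstract's claims is visible: self-attention runs only over the handful of global tokens, and the bridge cross attention is linear in the number of pixels, which keeps the transformer side cheap.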