Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them is necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). MLP-Mixer contains two types of layers: one with MLPs applied independently to image patches (i.e. "mixing" the per-location features), and one with MLPs applied across patches (i.e. "mixing" spatial information). When trained on large datasets, or with modern regularization schemes, MLP-Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models. We hope that these results spark further research beyond the realms of well-established CNNs and Transformers.
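To make the two mixing steps concrete, the sketch below shows one Mixer layer in JAX/Flax (the framework of the paper's reference implementation). It is a minimal illustration under stated assumptions: the module and parameter names (`MlpBlock`, `tokens_mlp_dim`, `channels_mlp_dim`) are chosen for readability and are not quoted from the paper's code.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn


class MlpBlock(nn.Module):
    """Two-layer MLP with a GELU nonlinearity."""
    hidden_dim: int

    @nn.compact
    def __call__(self, x):
        out_dim = x.shape[-1]
        x = nn.Dense(self.hidden_dim)(x)
        x = nn.gelu(x)
        return nn.Dense(out_dim)(x)


class MixerBlock(nn.Module):
    """One Mixer layer: token mixing across patches, then channel mixing per patch."""
    tokens_mlp_dim: int
    channels_mlp_dim: int

    @nn.compact
    def __call__(self, x):                  # x: (batch, patches, channels)
        # Token mixing: transpose so the MLP runs across the patch dimension,
        # shared over channels ("mixing" spatial information).
        y = nn.LayerNorm()(x)
        y = jnp.swapaxes(y, 1, 2)           # (batch, channels, patches)
        y = MlpBlock(self.tokens_mlp_dim, name="token_mixing")(y)
        y = jnp.swapaxes(y, 1, 2)           # back to (batch, patches, channels)
        x = x + y                           # skip connection
        # Channel mixing: the MLP runs over channels independently at each
        # patch location ("mixing" the per-location features).
        y = nn.LayerNorm()(x)
        return x + MlpBlock(self.channels_mlp_dim, name="channel_mixing")(y)


# Usage with illustrative shapes (e.g. 196 patches of a 224x224 image at patch size 16).
x = jnp.zeros((1, 196, 512))
block = MixerBlock(tokens_mlp_dim=256, channels_mlp_dim=2048)
params = block.init(jax.random.PRNGKey(0), x)
y = block.apply(params, x)                  # output shape equals input shape
```

Note the design choice visible in the sketch: both mixing steps are wrapped in layer normalization and skip connections, and neither relies on convolutions or self-attention.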