MetaFormer Baselines for Vision - Zhuanzhi (专知) Papers

December 22, 2022

MetaFormer Baselines for Vision

Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang

From arXiv. Comment: Add more ImageNet-22K pretrained models. Code: https://github.com/sail-sg/metaformer

MetaFormer, the abstracted architecture of the Transformer, has been found to play a significant role in achieving competitive performance. In this paper, we further explore the capacity of MetaFormer, again without focusing on token mixer design: we introduce several baseline models under MetaFormer using the most basic or common mixers, and summarize our observations as follows:

(1) MetaFormer ensures a solid lower bound of performance. By merely adopting identity mapping as the token mixer, the MetaFormer model, termed IdentityFormer, achieves >80% accuracy on ImageNet-1K.

(2) MetaFormer works well with arbitrary token mixers. Even when the token mixer is specified as a random matrix to mix tokens, the resulting model, RandFormer, yields an accuracy of >81%, outperforming IdentityFormer. One can rest assured of MetaFormer's results when new token mixers are adopted.

(3) MetaFormer effortlessly offers state-of-the-art results. With just conventional token mixers dating back five years, the models instantiated from MetaFormer already beat the state of the art.

(a) ConvFormer outperforms ConvNeXt. Taking the common depthwise separable convolutions as the token mixer, the model termed ConvFormer, which can be regarded as a pure CNN, outperforms the strong CNN model ConvNeXt.

(b) CAFormer sets a new record on ImageNet-1K. By simply applying depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model, CAFormer, sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.

In our expedition to probe MetaFormer, we also find that a new activation, StarReLU, reduces the FLOPs of activation by 71% compared with GELU yet achieves better performance. We expect StarReLU to find great potential in MetaFormer-like models alongside other neural networks.
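The abstract describes MetaFormer as a fixed block structure into which different token mixers can be plugged. The following is a minimal pure-Python sketch of that idea, not the authors' implementation (see the linked repository for that). It assumes the usual residual layout of a normalized token-mixing step followed by a normalized channel MLP. The StarReLU form `s * ReLU(x)**2 + b` comes from the paper body rather than from the abstract above. All function names, the softmax normalization of the random mixing matrix, the weight initialization, and the default `scale`/`bias` constants are illustrative assumptions.

```python
import math
import random
from typing import Callable, List

Tokens = List[List[float]]  # N tokens, each a C-dimensional vector


def star_relu(x: float, scale: float = 0.8944, bias: float = -0.4472) -> float:
    """StarReLU sketch: scale * ReLU(x)**2 + bias.

    The defaults are illustrative; they give roughly zero mean and unit
    variance for standard-normal input. In a trained model both would be
    learnable scalars.
    """
    r = max(x, 0.0)
    return scale * r * r + bias


def layer_norm(tokens: Tokens, eps: float = 1e-6) -> Tokens:
    """Normalize each token over its channels (no learnable affine here)."""
    out = []
    for t in tokens:
        mean = sum(t) / len(t)
        var = sum((v - mean) ** 2 for v in t) / len(t)
        out.append([(v - mean) / math.sqrt(var + eps) for v in t])
    return out


def identity_mixer(tokens: Tokens) -> Tokens:
    """IdentityFormer-style mixer: tokens are not mixed at all."""
    return [list(t) for t in tokens]


def make_random_mixer(num_tokens: int, seed: int = 0) -> Callable[[Tokens], Tokens]:
    """RandFormer-style mixer: a fixed random N x N matrix mixes tokens.

    Assumption: each row is softmax-normalized so it sums to 1.
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(num_tokens):
        row = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(num_tokens)]
        total = sum(row)
        weights.append([w / total for w in row])

    def mixer(tokens: Tokens) -> Tokens:
        dim = len(tokens[0])
        return [
            [sum(weights[i][j] * tokens[j][c] for j in range(len(tokens)))
             for c in range(dim)]
            for i in range(len(tokens))
        ]

    return mixer


def make_channel_mlp(dim: int, hidden: int, seed: int = 1) -> Callable[[Tokens], Tokens]:
    """Two-layer per-token MLP with StarReLU in between (random toy weights)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0.0, 0.02) for _ in range(hidden)] for _ in range(dim)]

    def mlp(tokens: Tokens) -> Tokens:
        out = []
        for t in tokens:
            h = [star_relu(sum(w * v for w, v in zip(row, t))) for row in w1]
            out.append([sum(w * v for w, v in zip(row, h)) for row in w2])
        return out

    return mlp


def add(a: Tokens, b: Tokens) -> Tokens:
    """Element-wise sum used for the residual connections."""
    return [[x + y for x, y in zip(ta, tb)] for ta, tb in zip(a, b)]


def metaformer_block(tokens: Tokens,
                     token_mixer: Callable[[Tokens], Tokens],
                     channel_mlp: Callable[[Tokens], Tokens]) -> Tokens:
    """One block: residual token mixing, then residual channel MLP."""
    tokens = add(tokens, token_mixer(layer_norm(tokens)))
    tokens = add(tokens, channel_mlp(layer_norm(tokens)))
    return tokens


if __name__ == "__main__":
    x = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0], [3.0, 3.5, -2.0, 1.0]]
    mlp = make_channel_mlp(dim=4, hidden=8)
    for name, mixer in [("identity", identity_mixer),
                        ("random", make_random_mixer(num_tokens=3))]:
        y = metaformer_block(x, mixer, mlp)
        print(name, len(y), "tokens x", len(y[0]), "channels")
```

Only the `token_mixer` argument changes between the two variants; the rest of the block is shared. That is the sense in which the abstract attributes performance to the general architecture rather than to a specific mixer. Convolution or self-attention mixers, as in ConvFormer and CAFormer, would slot into the same argument.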
