综合培训前综合变形器金字塔网络 (Integrally Pre-Trained Transformer Pyramid Networks) - 专知论文

会员服务 ·

0

Pyramid · Integration · MoDELS · Networking · 变换 ·

2022 年 11 月 23 日

Integrally Pre-Trained Transformer Pyramid Networks

翻译：综合培训前综合变形器金字塔网络

Yunjie Tian,Lingxi Xie,Zhaozhi Wang,Longhui Wei,Xiaopeng Zhang,Jianbin Jiao,Yaowei Wang,Qi Tian,Qixiang Ye

from arxiv, 13 pages, 5 figures, 13 tables

In this paper, we present an integral pre-training framework based on masked image modeling (MIM). We advocate for pre-training the backbone and neck jointly so that the transfer gap between MIM and downstream recognition tasks is minimal. We make two technical contributions. First, we unify the reconstruction and recognition necks by inserting a feature pyramid into the pre-training stage. Second, we complement mask image modeling (MIM) with masked feature modeling (MFM) that offers multi-stage supervision to the feature pyramid. The pre-trained models, termed integrally pre-trained transformer pyramid networks (iTPNs), serve as powerful foundation models for visual recognition. In particular, the base/large-level iTPN achieves an 86.2%/87.8% top-1 accuracy on ImageNet-1K, a 53.2%/55.6% box AP on COCO object detection with 1x training schedule using Mask-RCNN, and a 54.7%/57.7% mIoU on ADE20K semantic segmentation using UPerHead -- all these results set new records. Our work inspires the community to work on unifying upstream pre-training and downstream fine-tuning tasks. Code and the pre-trained models will be released at https://github.com/sunsmarterjie/iTPN.

翻译：在本文中,我们提出了一个基于蒙面图像建模(MIM)的综合培训前框架。我们共同倡导对骨干和颈部进行预培训,以便使MIM和下游识别任务之间的转移差距最小化。我们做出了两项技术贡献。首先,我们通过在培训前阶段插入一个特征金字塔将重建和识别脖子统一起来。第二,我们以蒙面特征建模(MIM)补充蒙面图像建模(MIM),为特征金字塔提供多阶段监督。预先培训的模型(iTPNs),被称为综合培训前的变形金字塔网络(iTPNs),作为视觉识别的强大基础模型。特别是,基础/大型 iTPNN在图像Net-1K上实现了86.2%/87.8%的最高-1精确度,在COO物体探测上实现了53.2%/55.6%的框 AP,在使用Mask-RCNNN和54.7%/57.7%的MIOU,关于使用UPer Checkations-train-train-train-trading Dredustrations.

0

相关内容

Pyramid

Pyramid is a small, fast, down-to-earth Python web application development framework.

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

专知

15+阅读 · 2018年6月29日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

肿瘤细胞中坏死基因Rip3的表达调控机制

国家自然科学基金

0+阅读 · 2014年12月31日

基于石墨烯气凝胶的高比能量被动式微型直接甲醇燃料电池研究

国家自然科学基金

0+阅读 · 2013年12月31日

Egr3调控造血干细胞功能的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于Split Bregman方法的全局凸快速图像分割模型的研究

国家自然科学基金

1+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

交互式Petri网及其兼容性研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

恶性疟疾致病相关var基因非编码核酸序列的表达调控及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

MicroRNA在HBV感染中作用机理的研究

国家自然科学基金

0+阅读 · 2008年12月31日

MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval

Arxiv

0+阅读 · 2023年1月25日

Efficient Training Under Limited Resources

Arxiv

0+阅读 · 2023年1月23日

Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision

Arxiv

0+阅读 · 2023年1月22日

ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning

Arxiv

0+阅读 · 2023年1月22日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Graph Transformer Networks

Arxiv

15+阅读 · 2020年2月5日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】基于物理的模拟

流匹配在生物学与生命科学中的应用综述

高质量数据集实践指南（1.0）

ICML 2025 关于语言模型机械可解释性的教程

相关资讯

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

专知

15+阅读 · 2018年6月29日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval

Arxiv

0+阅读 · 2023年1月25日

Efficient Training Under Limited Resources

Arxiv

0+阅读 · 2023年1月23日

Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision

Arxiv

0+阅读 · 2023年1月22日

ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning

Arxiv

0+阅读 · 2023年1月22日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Graph Transformer Networks

Arxiv

15+阅读 · 2020年2月5日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

相关基金

肿瘤细胞中坏死基因Rip3的表达调控机制

国家自然科学基金

0+阅读 · 2014年12月31日

基于石墨烯气凝胶的高比能量被动式微型直接甲醇燃料电池研究

国家自然科学基金

0+阅读 · 2013年12月31日

Egr3调控造血干细胞功能的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于Split Bregman方法的全局凸快速图像分割模型的研究

国家自然科学基金

1+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

交互式Petri网及其兼容性研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

恶性疟疾致病相关var基因非编码核酸序列的表达调控及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

MicroRNA在HBV感染中作用机理的研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员