通过最大限度地实现多模式相互信息最大化实现所有在一对一的预培训 (Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information) - 专知论文

会员服务 ·

0

INFORMS · 互信息 · 极大 · 预训练 · Performer ·

2022 年 11 月 17 日

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

翻译：通过最大限度地实现多模式相互信息最大化实现所有在一对一的预培训

Weijie Su,Xizhou Zhu,Chenxin Tao,Lewei Lu,Bin Li,Gao Huang,Yu Qiao,Xiaogang Wang,Jie Zhou,Jifeng Dai

To effectively exploit the potential of large-scale models, various pre-training strategies supported by massive data from different sources are proposed, including supervised pre-training, weakly-supervised pre-training, and self-supervised pre-training. It has been proved that combining multiple pre-training strategies and data from various modalities/sources can greatly boost the training of large-scale models. However, current works adopt a multi-stage pre-training system, where the complex pipeline may increase the uncertainty and instability of the pre-training. It is thus desirable that these strategies can be integrated in a single-stage manner. In this paper, we first propose a general multi-modal mutual information formula as a unified optimization target and demonstrate that all existing approaches are special cases of our framework. Under this unified perspective, we propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training). Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, COCO object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation. Notably, we successfully pre-train a billion-level parameter image backbone and achieve state-of-the-art performance on various benchmarks. Code shall be released.

翻译：为了有效地利用大型模式的潜力,提出了各种培训前战略,并辅之以来自不同来源的大量数据,包括受监督的训练前、受监督的训练前和自监督的训练前和自监督的训练前;已经证明,将多种培训前战略和各种模式/来源的数据结合起来,可以大大促进大型模式的培训;然而,目前的工作采用一个多阶段的训练前系统,复杂的培训前系统可能会增加培训前的不确定性和不稳定性;因此,最好能够将这些战略纳入一个阶段;在本文件中,我们首先提出一个一般的多模式相互信息公式,作为统一优化的目标,并表明所有现有办法都是我们框架的特殊情况;在这一统一的观点下,我们提议一个全阶段的单一培训前方法,名为优化多模式的相互信息前培训(M3I培训前),我们的方法在各种愿景基准方面,包括图像网络分类、COCO物体探测、LVIS前的多模式目标探测和STADE20级基本标准,将成功实现我们10亿个阶段的成绩。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

专知会员服务

54+阅读 · 2020年3月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Y-box结合蛋白1/Wnt5a途径诱导上皮间质转化促进胰腺癌干细胞侵袭转移的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三江平原阿布胶河流域氮素输出分异特征及溯源研究

国家自然科学基金

0+阅读 · 2014年12月31日

CXCR4与IL-35联合基因修饰间充质干细胞对溃疡性结肠炎局部免疫平衡的调节及清热燥湿凉血方的协同作用

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

miR-146a靶向IRAK1与TRAF6调控非小细胞肺癌转移的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

石墨烯基纳米复合有机玻璃的可控制备及光学非线性

国家自然科学基金

0+阅读 · 2012年12月31日

小RNA如何调控互为反义的CDK4/TSPAN31和CCND1/LOC15基因的表达及肝癌细胞的增殖与凋亡

国家自然科学基金

0+阅读 · 2012年12月31日

组团参加国际光学联合会大会

国家自然科学基金

0+阅读 · 2012年8月18日

Tiam1基因在肝癌转移中的作用及其网络调控机制

国家自然科学基金

0+阅读 · 2008年12月31日

Self-Supervised Learning via Maximum Entropy Coding

Arxiv

13+阅读 · 2022年10月20日

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Arxiv

13+阅读 · 2022年3月29日

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Arxiv

16+阅读 · 2022年3月25日

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

Arxiv

12+阅读 · 2021年12月30日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

专知会员服务

54+阅读 · 2020年3月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《分析与预测陆军战斗体能测试表现：统计与机器学习方法》2025最新137页

《军事行动中的人机协同共同学习》2025最新文献

代理式人工智能时代的决策优势

《F/A-18机队替换中队仿真模型的设计与分析》2025最新73页

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Self-Supervised Learning via Maximum Entropy Coding

Arxiv

13+阅读 · 2022年10月20日

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Arxiv

13+阅读 · 2022年3月29日

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Arxiv

16+阅读 · 2022年3月25日

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

Arxiv

12+阅读 · 2021年12月30日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

相关基金

Y-box结合蛋白1/Wnt5a途径诱导上皮间质转化促进胰腺癌干细胞侵袭转移的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三江平原阿布胶河流域氮素输出分异特征及溯源研究

国家自然科学基金

0+阅读 · 2014年12月31日

CXCR4与IL-35联合基因修饰间充质干细胞对溃疡性结肠炎局部免疫平衡的调节及清热燥湿凉血方的协同作用

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

miR-146a靶向IRAK1与TRAF6调控非小细胞肺癌转移的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

石墨烯基纳米复合有机玻璃的可控制备及光学非线性

国家自然科学基金

0+阅读 · 2012年12月31日

小RNA如何调控互为反义的CDK4/TSPAN31和CCND1/LOC15基因的表达及肝癌细胞的增殖与凋亡

国家自然科学基金

0+阅读 · 2012年12月31日

组团参加国际光学联合会大会

国家自然科学基金

0+阅读 · 2012年8月18日

Tiam1基因在肝癌转移中的作用及其网络调控机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员