A Survey of Vision-Language Pre-Trained Models - Zhuanzhi Papers

July 16, 2022

A Survey of Vision-Language Pre-Trained Models

Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao

From arXiv; accepted by the IJCAI-2022 survey track.

As the Transformer architecture has evolved, pre-trained models have advanced at a breakneck pace in recent years. They now dominate the mainstream techniques in natural language processing (NLP) and computer vision (CV). How to adapt pre-training to the field of Vision-and-Language (V-L) learning and improve downstream task performance has become a focus of multimodal learning. In this paper, we review the recent progress in Vision-Language Pre-Trained Models (VL-PTMs). As the core content, we first briefly introduce several ways to encode raw images and texts into single-modal embeddings before pre-training. Then, we dive into the mainstream architectures of VL-PTMs for modeling the interaction between text and image representations. We further present widely used pre-training tasks, and then introduce some common downstream tasks. Finally, we conclude the paper and present some promising research directions. Our survey aims to provide researchers with a synthesis of, and pointers to, related research.
