GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training


August 8, 2022


Jaeseok Byun, Taebaek Hwang, Jianlong Fu, Taesup Moon

Most existing vision and language pre-training (VLP) methods have focused mainly on how to extract and align vision and text features. In contrast to the mainstream VLP methods, we highlight that two routinely applied steps during pre-training have a crucial impact on the performance of the pre-trained model: in-batch hard negative sampling for image-text matching (ITM) and assigning a large masking probability for masked language modeling (MLM). After empirically showing the unexpected effectiveness of these two steps, we systematically devise GRIT-VLP, which adaptively samples mini-batches for more effective mining of hard negative samples for ITM while maintaining the computational cost of pre-training. Our method consists of three components: 1) a GRouped mIni-baTch sampling (GRIT) strategy that collects similar examples in a mini-batch, 2) an image-text contrastive (ITC) consistency loss for improving the mining ability, and 3) an enlarged masking probability for MLM. Consequently, we show that GRIT-VLP achieves new state-of-the-art performance on various downstream tasks with much less computational cost. Furthermore, we demonstrate that our model is essentially on par with ALBEF, the previous state of the art, with only one-third of the training epochs on the same training data. Code is available at https://github.com/jaeseokbyun/GRIT-VLP.
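The abstract names three mechanisms without giving details, so the following is only a minimal sketch of the general ideas, not the authors' implementation (see the linked repository for that). It assumes a precomputed image-text similarity matrix `sim`, where `sim[i][j]` scores image `i` against text `j` and the diagonal holds the matching pairs. The function names, the greedy nearest-neighbour grouping rule, the softmax-weighted negative sampling, and the `temperature` and `mask_prob` parameters are all illustrative assumptions.

```python
import math
import random


def group_by_similarity(sim, batch_size, start=0):
    """Greedy grouping (assumed rule): follow each example with its most
    similar unvisited example, then cut the ordering into mini-batches."""
    n = len(sim)
    order = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        last = order[-1]
        # sorted() makes tie-breaking deterministic.
        nxt = max(sorted(unvisited), key=lambda j: sim[last][j])
        order.append(nxt)
        unvisited.remove(nxt)
    return [order[k:k + batch_size] for k in range(0, n, batch_size)]


def sample_hard_negatives(sim, temperature=1.0, rng=None):
    """For each image i, sample a non-matching text j != i with probability
    proportional to exp(sim[i][j] / temperature)."""
    if rng is None:
        rng = random.Random(0)
    negatives = []
    for i, row in enumerate(sim):
        # Exclude the positive pair on the diagonal.
        candidates = [j for j in range(len(row)) if j != i]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(row[j] for j in candidates)
        weights = [math.exp((row[j] - top) / temperature) for j in candidates]
        negatives.append(rng.choices(candidates, weights=weights, k=1)[0])
    return negatives


def mask_tokens(tokens, mask_prob, mask_token="[MASK]", rng=None):
    """Replace each token by mask_token independently with probability
    mask_prob; labels keep the original token at masked positions."""
    if rng is None:
        rng = random.Random(0)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(tok)   # position the MLM loss would predict
        else:
            masked.append(tok)
            labels.append(None)  # position the MLM loss would ignore
    return masked, labels


if __name__ == "__main__":
    # Toy similarity matrix: examples 0 and 1 are alike, as are 2 and 3.
    toy_sim = [
        [1.0, 0.9, 0.1, 0.2],
        [0.9, 1.0, 0.2, 0.1],
        [0.1, 0.2, 1.0, 0.8],
        [0.2, 0.1, 0.8, 1.0],
    ]
    print(group_by_similarity(toy_sim, batch_size=2))
    print(sample_hard_negatives(toy_sim, temperature=0.1))
    print(mask_tokens("a dog runs on the beach".split(), mask_prob=0.5))
```

Grouping similar examples into the same mini-batch means the in-batch negatives drawn by `sample_hard_negatives` are closer to the positive pair, and raising `mask_prob` masks more tokens per caption; the abstract does not state the values the authors use, so none is assumed here.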

