Self-supervision and natural language supervision have emerged as two exciting ways to train general-purpose image encoders that excel at a variety of downstream tasks. Recent works such as M3AE and SLIP have suggested that these approaches can be effectively combined, but notably their results use small pre-training datasets (<50M samples) and do not reflect the large-scale regime (>100M samples) in which these approaches are commonly applied. Here we investigate whether a similar approach can be effective when trained with a much larger amount of data. We find that a combination of two state-of-the-art approaches, masked auto-encoders (MAE) and contrastive language-image pre-training (CLIP), provides a benefit over CLIP alone when trained on a corpus of 11.3M image-text pairs, but little to no benefit (as evaluated on a suite of common vision tasks) when trained on a large corpus of 1.4B images. Our work provides some much-needed clarity into the effectiveness (or lack thereof) of self-supervision for large-scale image-text training.
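To make the combined objective concrete, the sketch below shows one way a CLIP-style contrastive loss and an MAE-style masked reconstruction loss might be summed over a shared image encoder. This is a minimal illustration, not the authors' implementation; the function name `combined_loss` and the parameters `temperature` and `mae_weight` are assumptions introduced here for clarity.

```python
# Minimal sketch (not the authors' code) of combining a CLIP-style contrastive
# loss with an MAE-style reconstruction loss. Names and weights are illustrative.
import torch
import torch.nn.functional as F

def combined_loss(image_emb, text_emb, pred_patches, target_patches, mask,
                  temperature=0.07, mae_weight=1.0):
    """image_emb, text_emb: (B, D) projected embeddings for the contrastive loss.
    pred_patches, target_patches: (B, N, P) decoder outputs and pixel targets.
    mask: (B, N) binary mask, 1 for masked patches (MAE loss is computed there).
    """
    # CLIP-style symmetric InfoNCE loss over the batch.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    clip_loss = 0.5 * (F.cross_entropy(logits, labels) +
                       F.cross_entropy(logits.t(), labels))

    # MAE-style mean-squared reconstruction loss, averaged over masked patches only.
    per_patch = ((pred_patches - target_patches) ** 2).mean(dim=-1)
    mae_loss = (per_patch * mask).sum() / mask.sum().clamp(min=1)

    return clip_loss + mae_weight * mae_loss
```

In such a setup, `mae_weight` would control the relative contribution of the reconstruction objective; the experiments summarized above compare training with and without this self-supervised term at different data scales.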