Despite the progress of pre-trained language models, there is still no unified framework for sentence representation pre-training. As a result, different pre-training methods are required for specific scenarios, and the resulting models tend to be limited in universality and representation quality. In this work, we extend the recently proposed MAE-style pre-training strategy, RetroMAE, so that it can effectively support a wide variety of sentence representation tasks. The extended framework consists of two stages, with RetroMAE applied throughout the process. The first stage performs RetroMAE on generic corpora, such as Wikipedia and BookCorpus, from which the base model is learned. The second stage operates on domain-specific data, e.g., MS MARCO and NLI, where the base model is continually trained with RetroMAE and contrastive learning. The pre-training outputs of the two stages serve different applications, whose effectiveness is verified with comprehensive experiments. Concretely, the base model proves effective for zero-shot retrieval, achieving remarkable performance on the BEIR benchmark. The continually pre-trained models further benefit more downstream tasks, including domain-specific dense retrieval on MS MARCO and Natural Questions, as well as sentence embedding quality on standard STS and transfer tasks in SentEval. The empirical insights of this work may inspire future designs of sentence representation pre-training. Our pre-trained models and source code will be released to the public.
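As a rough illustration of the second-stage objective described above, the following minimal PyTorch sketch combines a generic MAE-style reconstruction loss with an in-batch contrastive (InfoNCE) loss over sentence embeddings. The function names, temperature, and loss weighting are illustrative assumptions and are not taken from the paper; the actual RetroMAE encoder/decoder losses are computed by the released implementation.

```python
import torch
import torch.nn.functional as F


def infonce_loss(query_emb: torch.Tensor, pos_emb: torch.Tensor,
                 temperature: float = 0.05) -> torch.Tensor:
    """In-batch-negative contrastive loss over L2-normalized sentence embeddings.
    (Hypothetical illustration; the temperature is an assumed hyper-parameter.)"""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_emb, dim=-1)
    logits = q @ p.t() / temperature                 # [B, B] similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # diagonal = positive pairs
    return F.cross_entropy(logits, labels)


def stage2_loss(retromae_recon_loss: torch.Tensor,
                query_emb: torch.Tensor, pos_emb: torch.Tensor,
                alpha: float = 1.0) -> torch.Tensor:
    """Joint second-stage objective: RetroMAE reconstruction loss plus the
    contrastive term. The weighting `alpha` is an assumption for illustration."""
    return retromae_recon_loss + alpha * infonce_loss(query_emb, pos_emb)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for encoder outputs.
    B, d = 8, 768
    recon = torch.tensor(2.3)                        # placeholder RetroMAE decoder loss
    q, p = torch.randn(B, d), torch.randn(B, d)
    print(stage2_loss(recon, q, p).item())
```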