走向不受监督的感官信息检索与违规学习 (Towards Unsupervised Dense Information Retrieval with Contrastive Learning) - 专知论文

会员服务 ·

0

INFORMS · BM25 · contrastive · 信息检索 · 对比学习 ·

2021 年 12 月 16 日

Towards Unsupervised Dense Information Retrieval with Contrastive Learning

翻译：走向不受监督的感官信息检索与违规学习

Gautier Izacard,Mathilde Caron,Lucas Hosseini,Sebastian Riedel,Piotr Bojanowski,Armand Joulin,Edouard Grave

Information retrieval is an important component in natural language processing, for knowledge intensive tasks such as question answering and fact checking. Recently, information retrieval has seen the emergence of dense retrievers, based on neural networks, as an alternative to classical sparse methods based on term-frequency. These models have obtained state-of-the-art results on datasets and tasks where large training sets are available. However, they do not transfer well to new domains or applications with no training data, and are often outperformed by term-frequency methods such as BM25 which are not supervised. Thus, a natural question is whether it is possible to train dense retrievers without supervision. In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers, and show that it leads to strong retrieval performance. More precisely, we show on the BEIR benchmark that our model outperforms BM25 on 11 out of 15 datasets. Furthermore, when a few thousands examples are available, we show that fine-tuning our model on these leads to strong improvements compared to BM25. Finally, when used as pre-training before fine-tuning on the MS-MARCO dataset, our technique obtains state-of-the-art results on the BEIR benchmark.

翻译：在自然语言处理中,信息检索是自然语言处理、知识密集型任务(如问答和事实检查)的一个重要部分。最近,信息检索发现在神经网络的基础上出现了密集的检索器,作为基于定期频率的经典稀疏方法的替代方法。这些模型在数据集和大型培训成套任务方面获得了最先进的结果。然而,它们没有顺利地转移到没有培训数据的新领域或应用程序,而且往往比没有监督的BM25等术语频率方法表现得更好。因此,一个自然的问题是,能否在没有监督的情况下培训密集的检索器。在这项工作中,我们探索对比学习的局限性,作为培训不受监督的密集检索器的一种方法,并表明它能够带来很强的检索性能。更准确地说,我们在BER基准基准中显示,我们的模型在15个数据集中的11个方面超过了BM25。此外,如果有数千个例子,我们展示了这些模型的微调模式与B25相比,将会带来强大的改进。最后,当在对BM25进行精准之前,我们用于对BIS-CO基准数据设置技术进行预先培训时,我们获得的MS-CO-M-MAR数据库数据。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

图解FixMatch的半监督学习，The Illustrated FixMatch for Semi-Supervised Learning

图解FixMatch的半监督学习，The Illustrated FixMatch for Semi-Supervised Learning

专知会员服务

26+阅读 · 2020年4月2日

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

专知会员服务

137+阅读 · 2020年2月6日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

对比学习（Contrastive Learning）相关进展梳理

对比学习（Contrastive Learning）相关进展梳理

PaperWeekly

11+阅读 · 2020年5月12日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知

6+阅读 · 2020年4月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【论文推荐】最新六篇自动问答相关论文—排序函数、文本摘要评估、信息抽取框架、层次递归编码器、半监督问答

【论文推荐】最新六篇自动问答相关论文—排序函数、文本摘要评估、信息抽取框架、层次递归编码器、半监督问答

专知

9+阅读 · 2018年5月10日

【论文推荐】最新六篇自动问答相关论文—无监督迁移学习、综述、生成式问答、QDEE、可扩展文档理解

【论文推荐】最新六篇自动问答相关论文—无监督迁移学习、综述、生成式问答、QDEE、可扩展文档理解

专知

12+阅读 · 2018年5月9日

Learning Weakly-Supervised Contrastive Representations

Learning Weakly-Supervised Contrastive Representations

Arxiv

1+阅读 · 2022年2月18日

Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning

Arxiv

0+阅读 · 2022年2月18日

Improving Biomedical Information Retrieval with Neural Retrievers

Arxiv

6+阅读 · 2022年1月19日

Max-Margin Contrastive Learning

Max-Margin Contrastive Learning

Arxiv

18+阅读 · 2021年12月21日

Optimizing Dense Retrieval Model Training with Hard Negatives

Arxiv

5+阅读 · 2021年4月16日

Model-Contrastive Federated Learning

Arxiv

10+阅读 · 2021年3月30日

Contrastive Learning with Hard Negative Samples

Arxiv

3+阅读 · 2021年1月24日

A Simple and Effective Self-Supervised Contrastive Learning Framework for Aspect Detection

Arxiv

9+阅读 · 2020年12月31日

Hard Negative Mixing for Contrastive Learning

Arxiv

5+阅读 · 2020年10月2日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

VIP会员

文章信息

相关主题

相关VIP内容

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

图解FixMatch的半监督学习，The Illustrated FixMatch for Semi-Supervised Learning

图解FixMatch的半监督学习，The Illustrated FixMatch for Semi-Supervised Learning

专知会员服务

26+阅读 · 2020年4月2日

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

专知会员服务

137+阅读 · 2020年2月6日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

对比学习（Contrastive Learning）相关进展梳理

对比学习（Contrastive Learning）相关进展梳理

PaperWeekly

11+阅读 · 2020年5月12日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知

6+阅读 · 2020年4月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【论文推荐】最新六篇自动问答相关论文—排序函数、文本摘要评估、信息抽取框架、层次递归编码器、半监督问答

【论文推荐】最新六篇自动问答相关论文—排序函数、文本摘要评估、信息抽取框架、层次递归编码器、半监督问答

专知

9+阅读 · 2018年5月10日

【论文推荐】最新六篇自动问答相关论文—无监督迁移学习、综述、生成式问答、QDEE、可扩展文档理解

【论文推荐】最新六篇自动问答相关论文—无监督迁移学习、综述、生成式问答、QDEE、可扩展文档理解

专知

12+阅读 · 2018年5月9日

相关论文

Learning Weakly-Supervised Contrastive Representations

Learning Weakly-Supervised Contrastive Representations

Arxiv

1+阅读 · 2022年2月18日

Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning

Arxiv

0+阅读 · 2022年2月18日

Improving Biomedical Information Retrieval with Neural Retrievers

Arxiv

6+阅读 · 2022年1月19日

Max-Margin Contrastive Learning

Max-Margin Contrastive Learning

Arxiv

18+阅读 · 2021年12月21日

Optimizing Dense Retrieval Model Training with Hard Negatives

Arxiv

5+阅读 · 2021年4月16日

Model-Contrastive Federated Learning

Arxiv

10+阅读 · 2021年3月30日

Contrastive Learning with Hard Negative Samples

Arxiv

3+阅读 · 2021年1月24日

A Simple and Effective Self-Supervised Contrastive Learning Framework for Aspect Detection

Arxiv

9+阅读 · 2020年12月31日

Hard Negative Mixing for Contrastive Learning

Arxiv

5+阅读 · 2020年10月2日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

微信扫码咨询专知VIP会员