CODER: An efficient framework for improving retrieval through COntextualized Document Embedding Reranking - 专知 Papers


December 16, 2021


George Zerveas, Navid Rekabsaz, Daniel Cohen, Carsten Eickhoff

We present a framework for improving the performance of a wide class of retrieval models at minimal computational cost. It utilizes precomputed document representations extracted by a base dense retrieval method and involves training a model to jointly score a large set of retrieved candidate documents for each query, while potentially transforming on the fly the representation of each document in the context of the other candidates as well as the query itself. When scoring a document representation based on its similarity to a query, the model is thus aware of the representation of its "peer" documents. We show that our approach leads to substantial improvement in retrieval performance over the base method and over scoring candidate documents in isolation from one another, as in a pair-wise training setting. Crucially, unlike term-interaction rerankers based on BERT-like encoders, it incurs a negligible computational overhead on top of any first-stage method at run time, allowing it to be easily combined with any state-of-the-art dense retrieval method. Finally, concurrently considering a set of candidate documents for a given query enables additional valuable capabilities in retrieval, such as score calibration and mitigating societal biases in ranking.
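The contrast the abstract draws between scoring candidates in isolation and scoring them jointly can be illustrated with a small sketch. This is not the CODER architecture: the abstract describes a trained model, whereas the code below assumes a single parameter-free attention step as a stand-in for the learned transformation. The function names (`score_in_isolation`, `contextualize`, `score_jointly`), the `mix` parameter, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def score_in_isolation(query, docs):
    """Baseline: each precomputed document vector is scored against the query alone."""
    return [dot(query, d) for d in docs]


def contextualize(query, docs, mix=0.5):
    """Assumed stand-in for a learned transformation (not the paper's model).

    Each document attends over the query and all candidates, and its vector
    is blended with the attention-weighted average, so the new representation
    depends on its peer documents.
    """
    context = [query] + docs
    scale = math.sqrt(len(query))
    contextualized = []
    for d in docs:
        weights = softmax([dot(d, c) / scale for c in context])
        pooled = [
            sum(w * c[i] for w, c in zip(weights, context))
            for i in range(len(d))
        ]
        contextualized.append(
            [(1 - mix) * a + mix * b for a, b in zip(d, pooled)]
        )
    return contextualized


def score_jointly(query, docs, mix=0.5):
    """Score every candidate after contextualizing it against its peers."""
    return [dot(query, d) for d in contextualize(query, docs, mix)]


if __name__ == "__main__":
    # Toy 2-dimensional "precomputed" embeddings (made up for illustration).
    query = [1.0, 0.0]
    candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print("isolated:", score_in_isolation(query, candidates))
    print("joint:   ", score_jointly(query, candidates))
```

In this sketch the only inputs are vectors that a first-stage dense retriever would already have computed, which mirrors the abstract's point that no term-level re-encoding of documents is needed at run time. Swapping one candidate for another changes the joint score of the remaining candidates but leaves their isolated scores untouched.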

