Boosted Dense Retriever - Zhuanzhi (专知) Papers


December 14, 2021

Boosted Dense Retriever


Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel

We propose DrBoost, a dense retrieval ensemble inspired by boosting. DrBoost is trained in stages: each component model is learned sequentially and specialized by focusing only on retrieval mistakes made by the current ensemble. The final representation is the concatenation of the output vectors of all the component models, making it a drop-in replacement for standard dense retrievers at test time. DrBoost enjoys several advantages compared to standard dense retrieval models. It produces representations which are 4x more compact, while delivering comparable retrieval results. It also performs surprisingly well under approximate search with coarse quantization, reducing latency and bandwidth needs by another 4x. In practice, this can make the difference between serving indices from disk versus from memory, paving the way for much cheaper deployments.
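Two ideas in the abstract can be illustrated with a toy sketch: the final representation is a concatenation of component vectors, so the ensemble's inner-product score is the sum of the per-component scores, and each new stage targets the queries the current ensemble still gets wrong. The code below is not the authors' implementation. The function names (`concat`, `ensemble_score`, `retrieval_mistakes`) and all embedding values are hypothetical, and the second component is set by hand rather than trained, purely to show the mechanics.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, the usual dense-retrieval relevance score."""
    return sum(x * y for x, y in zip(a, b))


def concat(parts: Sequence[Vector]) -> Vector:
    """Concatenate the output vectors of all component models."""
    return [x for part in parts for x in part]


def ensemble_score(query_parts: Sequence[Vector],
                   passage_parts: Sequence[Vector]) -> float:
    """Score a (query, passage) pair using the concatenated representation."""
    return dot(concat(query_parts), concat(passage_parts))


def retrieval_mistakes(query_vecs: Sequence[Vector],
                       passage_vecs: Sequence[Vector],
                       gold: Sequence[int],
                       k: int = 1) -> List[int]:
    """Indices of queries whose gold passage is missing from the top-k.

    In a boosting-style setup, the next component would be trained
    on these queries only (training itself is omitted here).
    """
    mistakes = []
    for qi, q in enumerate(query_vecs):
        ranked = sorted(range(len(passage_vecs)),
                        key=lambda pi: dot(q, passage_vecs[pi]),
                        reverse=True)
        if gold[qi] not in ranked[:k]:
            mistakes.append(qi)
    return mistakes


# Toy data (hypothetical): two queries, two passages, gold passage per query.
gold = [0, 1]

# Stage 1: embeddings from a first (hypothetical) component model.
stage1_q = [[1.0, 0.0], [0.6, 0.4]]
stage1_p = [[1.0, 0.0], [0.0, 1.0]]
stage1_mistakes = retrieval_mistakes(stage1_q, stage1_p, gold)

# Stage 2: a second component, hand-set here to stand in for a model
# trained on the stage-1 mistakes.
stage2_q = [[0.0], [1.0]]
stage2_p = [[0.0], [1.0]]

# Final representation: concatenation of both components' outputs.
full_q = [concat([a, b]) for a, b in zip(stage1_q, stage2_q)]
full_p = [concat([a, b]) for a, b in zip(stage1_p, stage2_p)]
ensemble_mistakes = retrieval_mistakes(full_q, full_p, gold)
```

Because the concatenated vectors are ordinary dense vectors, they can be indexed and searched exactly like the output of a single encoder, which is what makes the ensemble a drop-in replacement at test time.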

