Task-aware Retrieval with Instructions

We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose, task-aware retrieval system using multi-task instruction tuning, one that can follow human-written instructions to find the best documents for a given query. We introduce BERRI, the first large-scale collection of approximately 40 retrieval datasets with instructions, and present TART, a multi-task retrieval system trained on BERRI with instructions. TART shows strong capabilities to adapt to a new retrieval task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup, X^2-Retrieval, to better reflect real-world scenarios, in which diverse domains and tasks are pooled and a system needs to find documents that align with users' intents. In this setup, TART significantly outperforms competitive baselines, further demonstrating the effectiveness of guiding retrieval with instructions.
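To make the idea of "retrieval with instructions" concrete, the following is a minimal toy sketch and not the TART model itself. It assumes that the instruction is simply concatenated with the query before encoding (the abstract does not specify how the instruction is fed to the model), and it substitutes a bag-of-words cosine similarity for a trained neural encoder. The names `embed`, `cosine`, and `retrieve`, as well as the three-document corpus, are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'encoder': lowercase token counts (stand-in for a trained model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(instruction, query, corpus, k=1):
    """Rank a pooled corpus against instruction + query (assumed concatenation)."""
    q = embed(instruction + " " + query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


# Hypothetical pooled corpus mixing several document types.
corpus = [
    "code snippet python reverse string: s[::-1]",
    "forum question: how to reverse a string in python",
    "encyclopedia paragraph: python is a genus of snakes",
]

query = "reverse string python"
print(retrieve("retrieve a code snippet", query, corpus))
print(retrieve("retrieve a forum question", query, corpus))
```

In this toy setting, the same query returns the code snippet under the first instruction and the forum question under the second, which mirrors the pooled, multi-task scenario the abstract describes. A real system would replace `embed` with a learned encoder trained on instruction-annotated data.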

