We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries, making the system task-aware. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning that can follow human-written instructions to find the best documents for a given query. To this end, we introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, and present TART, a multi-task retrieval system trained on these diverse retrieval tasks with instructions. TART shows strong capabilities to adapt to a new task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup that better reflects real-world scenarios by pooling diverse documents and tasks. In this setup, TART significantly outperforms competitive baselines, further demonstrating the effectiveness of guiding retrieval with instructions.
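To make the idea of retrieval with instructions concrete, the following is a minimal sketch of how an instruction might be paired with a query before ranking a pooled corpus. It is not the TART implementation: the `[SEP]` separator, the helper names (`build_input`, `rank`), the example documents, and the bag-of-words cosine scorer are all assumptions chosen for illustration, with the lexical scorer standing in for a learned encoder.

```python
import math
import re
from collections import Counter

# Assumed separator between instruction and query; illustrative only.
SEP = " [SEP] "


def build_input(instruction: str, query: str) -> str:
    """Concatenate the instruction and the query into one retriever input."""
    return f"{instruction.strip()}{SEP}{query.strip()}"


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a toy stand-in for a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(instruction: str, query: str, docs: list[str]) -> list[str]:
    """Return docs sorted by similarity to the instruction-plus-query input."""
    q = bag_of_words(build_input(instruction, query))
    return sorted(docs, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)


# Hypothetical pooled corpus mixing two document types.
CODE_DOC = "python code snippet: sorted(items) returns a new sorted list"
WIKI_DOC = "wikipedia paragraph: a sorting algorithm puts elements of a list in order"
DOCS = [CODE_DOC, WIKI_DOC]

QUERY = "how to sort a list"
CODE_INSTRUCTION = "retrieve a python code snippet that answers this question"
WIKI_INSTRUCTION = "retrieve a wikipedia paragraph that answers this question"

if __name__ == "__main__":
    # Same query, different instructions, different top result.
    print(rank(CODE_INSTRUCTION, QUERY, DOCS)[0])
    print(rank(WIKI_INSTRUCTION, QUERY, DOCS)[0])
```

The same query is ranked twice over the same pooled documents, and only the instruction changes. In this toy setup the instruction affects the ranking only through word overlap with the documents; a trained system would instead learn that mapping from instruction-annotated datasets.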