Much recent research on information retrieval has focused on how to transfer from one task (typically with abundant supervised data) to various other tasks where supervision is limited, with the implicit assumption that it is possible to generalize from one task to all the rest. However, this overlooks the fact that there are many diverse and unique retrieval tasks, each targeting different search intents, queries, and search domains. In this paper, we propose to work on Few-shot Dense Retrieval, a setting where each task comes with a short description and a few examples. To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLMs) as a few-shot query generator, and creates task-specific retrievers based on the generated data. Powered by the LLM's generalization ability, Promptagator makes it possible to create task-specific end-to-end retrievers solely based on a few examples, without using Natural Questions or MS MARCO to train question generators or dual encoders. Surprisingly, LLM prompting with no more than 8 examples allows dual encoders to outperform heavily engineered models trained on MS MARCO, such as ColBERT v2, by more than 1.2 nDCG on average on 11 retrieval sets. Further training standard-size re-rankers using the same generated data yields another 5.0 point nDCG improvement. Our studies determine that query generation can be far more effective than previously observed, especially when a small amount of task-specific knowledge is given.
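To make the few-shot setup concrete, the following is a minimal sketch of how a query-generation prompt might be assembled from a task description and a handful of (document, query) example pairs. The template format and function name are illustrative assumptions, not the paper's exact prompt.

```python
def build_fewshot_prompt(task_description, examples, new_document):
    """Assemble a few-shot query-generation prompt for an LLM.

    `examples` is a list of (document, query) pairs -- no more than 8 in the
    few-shot setting described above. The LLM is expected to complete the
    final "Query:" line with a synthetic query for `new_document`; that
    (document, generated query) pair then becomes training data for a
    task-specific retriever. The exact template below is an assumption
    for illustration only.
    """
    parts = [task_description, ""]
    for doc, query in examples:
        parts.append(f"Document: {doc}")
        parts.append(f"Query: {query}")
        parts.append("")
    parts.append(f"Document: {new_document}")
    parts.append("Query:")  # left open for the LLM to complete
    return "\n".join(parts)


prompt = build_fewshot_prompt(
    task_description="Given a scientific claim, retrieve evidence abstracts.",
    examples=[("Aspirin reduces cardiovascular risk in adults.",
               "does aspirin lower heart attack risk")],
    new_document="Vitamin D supplementation shows no effect on bone density.",
)
```

Running the generator over an unlabeled target corpus produces synthetic (query, document) pairs with which a dual encoder or re-ranker can be trained, without touching MS MARCO or Natural Questions supervision.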