Various techniques have been developed in recent years to improve dense retrieval (DR), such as unsupervised contrastive learning and pseudo-query generation. Existing DRs, however, often suffer from effectiveness tradeoffs between supervised and zero-shot retrieval, which some argue is due to limited model capacity. We contradict this hypothesis and show that a generalizable DR can be trained to achieve high accuracy in both supervised and zero-shot retrieval without increasing model size. In particular, we systematically examine the contrastive learning of DRs under the framework of Data Augmentation (DA). Our study shows that common DA practices, such as query augmentation with generative models and pseudo-relevance label creation using a cross-encoder, are often inefficient and sub-optimal. We hence propose a new DA approach with diverse queries and sources of supervision to progressively train a generalizable DR. As a result, DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations, and it even competes with models using more complex late interaction (ColBERTv2 and SPLADE++).
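For context, dense retrievers of this kind are typically trained with an InfoNCE-style contrastive objective over query-passage pairs; the following is a minimal sketch of that standard formulation, not necessarily the exact training objective used in this work. Here q is a query, p^+ its relevant (possibly pseudo-labeled) passage, {p_j^-} a set of negative passages, s(q, p) the similarity (e.g., dot product) of the query and passage embeddings, and \tau a temperature:

\[
\mathcal{L}(q, p^{+}) = -\log \frac{\exp\big(s(q, p^{+})/\tau\big)}{\exp\big(s(q, p^{+})/\tau\big) + \sum_{j=1}^{n} \exp\big(s(q, p^{-}_{j})/\tau\big)}
\]

Under the DA view taken here, both the queries q (e.g., pseudo-queries produced by a generative model) and the relevance labels attached to p^+ and p_j^- (e.g., scores from a cross-encoder) can be augmented, which is the axis along which the proposed approach diversifies supervision.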