Deep prompt tuning (DPT) has achieved great success in most natural language processing~(NLP) tasks. However, it remains under-investigated in dense retrieval, where fine-tuning~(FT) still dominates. When deploying multiple retrieval tasks on the same backbone model~(e.g., RoBERTa), FT-based methods are costly to deploy: each new retrieval model requires deploying a separate copy of the backbone model, with no reuse across tasks. To reduce the deployment cost in such a scenario, this work investigates applying DPT to dense retrieval. The challenge is that directly applying DPT to dense retrieval substantially underperforms FT methods. To compensate for the performance drop, we propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining, forming a general approach that is compatible with any pre-trained language model and retrieval task. The experimental results show that the proposed method (called DPTDR) outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions. We also conduct ablation studies to examine the effectiveness of each strategy in DPTDR. We believe this work is beneficial to industry, as it saves substantial deployment effort and cost and increases the utilization of computing resources. Our code is available at https://github.com/tangzhy/DPTDR.
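To make the core idea concrete, below is a minimal sketch of what deep prompt tuning looks like for a dual-encoder retriever. This is not the paper's actual implementation: it assumes a HuggingFace RoBERTa backbone and a transformers version where `RobertaModel` accepts tuple-style `past_key_values` (the mechanism used by the P-Tuning v2 reference implementation); the class and parameter names (`DeepPromptRetriever`, `prompt_len`) are illustrative.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class DeepPromptRetriever(nn.Module):
    """Sketch of a deep-prompt-tuned encoder for dense retrieval.

    Learnable key/value prefixes are injected at every transformer layer
    via `past_key_values`; the backbone stays frozen, so only the small
    prompt tensor needs to be trained and stored per retrieval task.
    """

    def __init__(self, model_name="roberta-base", prompt_len=32):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(model_name)
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the shared backbone

        cfg = self.backbone.config
        self.prompt_len = prompt_len
        self.n_layers = cfg.num_hidden_layers
        self.n_heads = cfg.num_attention_heads
        self.head_dim = cfg.hidden_size // cfg.num_attention_heads
        # One key prefix and one value prefix per layer.
        self.prompts = nn.Parameter(
            torch.randn(2 * self.n_layers, prompt_len, cfg.hidden_size) * 0.02
        )

    def forward(self, input_ids, attention_mask):
        bsz = input_ids.size(0)
        # Reshape prompts into per-layer (key, value) tuples of shape
        # (bsz, n_heads, prompt_len, head_dim), as expected by the backbone.
        kv = self.prompts.view(
            2 * self.n_layers, self.prompt_len, self.n_heads, self.head_dim
        ).permute(0, 2, 1, 3)
        past = [
            (kv[2 * i].unsqueeze(0).expand(bsz, -1, -1, -1),
             kv[2 * i + 1].unsqueeze(0).expand(bsz, -1, -1, -1))
            for i in range(self.n_layers)
        ]
        # Extend the attention mask so real tokens can attend to the prefix.
        prefix_mask = torch.ones(bsz, self.prompt_len, device=input_ids.device)
        mask = torch.cat([prefix_mask, attention_mask], dim=1)
        out = self.backbone(input_ids=input_ids, attention_mask=mask,
                            past_key_values=past)
        return out.last_hidden_state[:, 0]  # [CLS] embedding as the dense vector
```

Under this setup, only `self.prompts` is updated during training, so a single frozen backbone deployment can be shared by many retrieval tasks, each adding just its own prompt parameters, which is the deployment saving the abstract refers to.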