Conversational search (CS) plays a vital role in information retrieval. The current state of the art approaches the task with a multi-stage pipeline comprising conversational query reformulation and information seeking modules. Despite its effectiveness, such a pipeline often involves multiple neural models and thus requires long inference times. In addition, independently optimizing the effectiveness of each module ignores the relations between modules in the pipeline. In this paper, we therefore propose a single-stage design, which supports end-to-end training and low-latency inference. To aid in this goal, we create a synthetic dataset for CS to overcome the lack of training data and explore different training strategies using this dataset. Experiments demonstrate that our model yields retrieval effectiveness competitive with state-of-the-art multi-stage approaches but with lower latency. Furthermore, we show that the improved retrieval effectiveness benefits the downstream task of conversational question answering.