Semantic search for candidate retrieval is an important yet neglected problem in retrieval-based chatbots; it aims to select a set of candidate responses efficiently from a large pool. The main bottleneck is designing a model architecture that satisfies two requirements: 1) rich interactions between a query and a response, so that the retrieved responses are relevant to the query; 2) the ability to project the query and the response into latent spaces separately, so that semantic search can be applied efficiently during online inference. To tackle this problem, we propose a novel approach, called Multitask-based Semantic Search Neural Network (MSSNN), for candidate retrieval, which accomplishes query-response interactions through multiple tasks. The method employs a Seq2Seq modeling task to learn a good query encoder, then performs a word prediction task to build response embeddings, and finally applies a simple matching model to form the dot-product scorer. Experimental studies have demonstrated the potential of the proposed approach.
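To illustrate the second requirement, the following is a minimal sketch of dot-product candidate retrieval with separately computed embeddings. It is not the MSSNN model: the `embed` function is a hypothetical stand-in (a sum of fixed pseudo-random token vectors) for the learned query and response encoders, and the names `build_index` and `retrieve` are assumptions made for this example. The point it shows is that response embeddings can be computed once offline, so online inference needs only one query encoding and one matrix-vector product.

```python
import numpy as np


def embed(text, dim=64):
    """Hypothetical stand-in encoder (NOT the MSSNN encoders).

    Sums one fixed pseudo-random vector per token and L2-normalizes the result.
    """
    vec = np.zeros(dim)
    for token in text.lower().split():
        # Derive a deterministic seed from the token bytes so that the same
        # token always maps to the same vector.
        seed = int.from_bytes(token.encode("utf-8"), "little") % (2**32)
        vec += np.random.default_rng(seed).standard_normal(dim)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def build_index(responses, dim=64):
    """Offline step: embed every candidate response once, independent of any query."""
    return np.stack([embed(r, dim) for r in responses])


def retrieve(query, index, k=2):
    """Online step: embed the query, score all candidates by dot product, keep the top k."""
    scores = index @ embed(query, index.shape[1])
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]


responses = [
    "the weather is sunny today",
    "i like pizza",
    "my favorite food is pasta",
]
index = build_index(responses)
for i, score in retrieve("the weather is sunny today", index, k=2):
    print(i, round(score, 3), responses[i])
```

Because the query and the responses never interact inside the scorer, the index can be handed to an approximate nearest-neighbor library for large pools; the multitask training described in the abstract is what would supply query-response interaction in the real model.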