This paper presents CLEAR, a retrieval model that complements classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. Using a novel residual-based embedding learning method, CLEAR explicitly trains the neural embedding to encode the language structures and semantics that lexical retrieval fails to capture. Empirical evaluations demonstrate that CLEAR outperforms state-of-the-art retrieval models and that it can substantially improve the end-to-end accuracy and efficiency of reranking pipelines.
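To make the idea of residual-based learning concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not give the scoring function or the loss, so everything here is an assumption for illustration: the function names `hybrid_score` and `residual_margin_loss`, the dot-product similarity, the additive combination of lexical and embedding scores, and the hinge loss whose margin shrinks when the lexical model already ranks the relevant document above the non-relevant one.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def hybrid_score(lexical_score, query_vec, doc_vec, weight=1.0):
    """Combine a lexical score (e.g. from BM25) with an embedding similarity.

    Assumption: the two signals are simply added, with `weight` scaling
    the embedding part.
    """
    return lexical_score + weight * dot(query_vec, doc_vec)


def residual_margin_loss(query_vec, pos_vec, neg_vec,
                         lex_pos, lex_neg,
                         base_margin=1.0, scale=0.1):
    """Hinge loss on embedding scores with a lexical-aware margin.

    Assumption: the margin is reduced by how well the lexical model
    already separates the positive from the negative document, so the
    embedding is pushed hardest on pairs the lexical model gets wrong.
    """
    margin = base_margin - scale * (lex_pos - lex_neg)
    emb_pos = dot(query_vec, pos_vec)
    emb_neg = dot(query_vec, neg_vec)
    return max(0.0, margin - emb_pos + emb_neg)


if __name__ == "__main__":
    q = [1.0, 0.0]
    d_pos = [0.5, 0.0]
    d_neg = [0.5, 0.0]
    # Lexical model already ranks the pair correctly: little left to learn.
    print(residual_margin_loss(q, d_pos, d_neg, lex_pos=12.0, lex_neg=2.0))
    # Lexical model ranks the pair wrongly: the embedding must compensate.
    print(residual_margin_loss(q, d_pos, d_neg, lex_pos=2.0, lex_neg=12.0))
```

In this sketch the embedding receives no training signal on pairs that the lexical scores already separate by a wide gap, and a large one on pairs the lexical scores get wrong, which is one way to read "encoding what lexical retrieval fails to capture".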