Query reformulation is a key mechanism for alleviating the linguistic mismatch that queries suffer from in ad-hoc retrieval. Among the various solutions, query reduction removes extraneous terms and distills a long query into a concise expression of user intent. However, capturing hidden and diverse user intent remains challenging. This paper proposes Contextualized Query Reduction (ConQueR), which uses a pre-trained language model (PLM). Specifically, it reduces verbose queries from two different views: core term extraction and sub-query selection. The former extracts core terms from the original query at the term level, and the latter determines whether a given sub-query is a suitable reduction of the original query at the sequence level. Since the two views operate at different levels of granularity and complement each other, their outputs are finally aggregated in an ensemble manner. We evaluate the reduction quality of ConQueR on real-world search logs collected from a commercial web search engine. ConQueR achieves gains of up to 8.45% in exact match score over the best competing model.
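To make the two-view idea concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes that the term-level view yields a per-term keep probability (`keep_prob`) and that the sequence-level view is a callable (`seq_score`) scoring a candidate sub-query; in ConQueR both would come from PLM-based models, whereas here they are hypothetical stand-ins. The weighted-average aggregation and the `weight` parameter are likewise assumptions chosen for illustration, since the abstract only states that the views are combined in an ensemble manner. The `exact_match` helper shows the usual meaning of the metric: the fraction of predicted reductions that equal the reference reduction exactly.

```python
from itertools import combinations


def term_view_score(kept_idx, keep_prob):
    """Term-level view (assumed form): mean agreement between a candidate's
    keep/drop decisions and the per-term keep probabilities."""
    kept = set(kept_idx)
    scores = [p if i in kept else 1.0 - p for i, p in enumerate(keep_prob)]
    return sum(scores) / len(scores)


def reduce_query(terms, keep_prob, seq_score, weight=0.5):
    """Pick the order-preserving sub-query with the best combined score.

    `weight` balances the two views; the weighted average is an assumption
    of this sketch. Enumeration is exponential in the query length, so this
    is only suitable for short toy queries.
    """
    best, best_score = None, float("-inf")
    for k in range(1, len(terms) + 1):
        for idx in combinations(range(len(terms)), k):
            sub = " ".join(terms[i] for i in idx)
            score = (weight * term_view_score(idx, keep_prob)
                     + (1.0 - weight) * seq_score(sub))
            if score > best_score:
                best, best_score = sub, score
    return best


def exact_match(predictions, references):
    """Fraction of predictions identical to their reference reduction."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    terms = ["cheap", "flights", "to", "new", "york", "please"]
    keep_prob = [0.4, 0.9, 0.3, 0.8, 0.9, 0.1]  # hypothetical term-level outputs

    def toy_seq_score(sub):
        # Hypothetical sequence-level scorer favoring one sub-query.
        return 1.0 if sub == "cheap flights new york" else 0.0

    print(reduce_query(terms, keep_prob, toy_seq_score))
```

In this toy setup the term-level view alone would drop "cheap" (its keep probability is below 0.5), but the sequence-level score pulls it back in, which illustrates how views at different granularities can complement each other.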