Query categorization is an essential part of query intent understanding in e-commerce search. A common query categorization task is to select the relevant fine-grained product categories in a product taxonomy. For frequent queries, rich customer behavior (e.g., click-through data) can be used to infer the relevant product categories. However, for rarer queries, which together account for a large volume of search traffic, customer behavior alone may not suffice because this signal is scarce. To improve the categorization of rare queries, we adapt the Pseudo-Relevance Feedback (PRF) approach to use the latent knowledge embedded in semantically or lexically similar product documents to enrich the representation of rare queries. To this end, we propose a novel deep neural model named \textbf{A}ttentive \textbf{P}seudo \textbf{R}elevance \textbf{F}eedback \textbf{Net}work (APRF-Net) to enhance the representation of rare queries for query categorization. To demonstrate the effectiveness of our approach, we collect search queries from a large commercial search engine and compare APRF-Net to state-of-the-art deep learning models for text classification. Our results show that APRF-Net significantly improves query categorization, by 5.9\% in $F1@1$ score over the baselines, and that this improvement rises to 8.2\% for rare (tail) queries. The findings of this paper can be leveraged for further improvements in search query representation and understanding.
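To make the idea of attentive pseudo-relevance feedback concrete, the following is a minimal sketch, not the APRF-Net architecture itself, whose details are not given here. It assumes that the query and the top retrieved product documents have already been embedded as fixed-length vectors, and it uses scaled dot-product attention followed by concatenation as an illustrative choice. The names `enrich_query`, `query_vec`, and `doc_vecs` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enrich_query(query_vec, doc_vecs):
    """Append an attention-pooled summary of retrieved documents to the query.

    query_vec: list of floats, the (assumed) query embedding.
    doc_vecs:  list of lists of floats, embeddings of the pseudo-relevant
               documents, each with the same length as query_vec.
    Returns (enriched_vector, attention_weights).
    """
    dim = len(query_vec)
    # Scaled dot-product score between the query and each retrieved document.
    scores = [
        sum(q * d for q, d in zip(query_vec, doc)) / math.sqrt(dim)
        for doc in doc_vecs
    ]
    weights = softmax(scores)
    # Attention-weighted average of the document embeddings.
    feedback = [
        sum(w * doc[i] for w, doc in zip(weights, doc_vecs))
        for i in range(dim)
    ]
    # Enriched representation: original query followed by the feedback vector.
    return list(query_vec) + feedback, weights


# Toy example with hand-picked 2-dimensional embeddings (illustrative only).
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
enriched, weights = enrich_query(query_vec, doc_vecs)
```

In this sketch, documents whose embeddings align with the query receive larger weights, so a rare query with little behavioral signal borrows most of its extra representation from its closest retrieved products. A downstream classifier over the enriched vector would then predict the product categories.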