We consider the problem of semantic matching in product search: given a customer query, retrieve all semantically related products from a catalog of 100 million or more products. Because of the large catalog space and real-time latency constraints, semantic matching algorithms must not only achieve high recall but also operate within a tight latency budget. Conventional lexical matching approaches (e.g., Okapi BM25) exploit inverted indices for fast inference, but fail to capture behavioral signals between queries and products. In contrast, embedding-based models learn semantic representations from customer behavior data, but their performance is often limited by the shallow neural encoders that latency constraints impose. Semantic product search can be viewed as an eXtreme Multi-label Classification (XMC) problem, where customer queries are input instances and products are output labels. In this paper, we aim to improve semantic product search by using tree-based XMC models whose inference time complexity is logarithmic in the number of products. We consider hierarchical linear models with n-gram features for fast real-time inference. Quantitatively, our method maintains a low latency of 1.25 milliseconds per query and achieves a 65% relative improvement in Recall@100 (60.9% vs. 36.8%) over a competing embedding-based DSSM model. Our model is robust to weight pruning with varying thresholds, which can flexibly meet different system requirements for online deployment. Qualitatively, our method retrieves products that are complementary to those of the existing product search system, adding diversity to the match set.
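The core idea of tree-based XMC retrieval with n-gram features can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature hashing scheme, tree layout, node scorers, and beam-search parameters below are all assumptions made for the sake of a self-contained example. Queries are hashed bags of character n-grams, and a small tree of linear scorers is traversed with beam search, so inference cost grows logarithmically with the number of product labels rather than linearly.

```python
# Hypothetical sketch of tree-based retrieval over hashed n-gram features.
# Not the paper's model: dimensions, tree shape, and scoring are illustrative.
import numpy as np

DIM = 2048  # hashed feature dimension (assumption)

def ngram_features(query, n=3, dim=DIM):
    """Hash character n-grams of the query into a dense count vector."""
    vec = np.zeros(dim)
    q = f" {query.lower()} "  # pad so word boundaries form n-grams
    for i in range(len(q) - n + 1):
        vec[hash(q[i:i + n]) % dim] += 1.0
    return vec

class TreeNode:
    """A node in the label tree: a linear scorer, plus either child
    nodes (internal node) or product labels (leaf node)."""
    def __init__(self, weight, children=None, labels=None):
        self.weight = weight
        self.children = children or []
        self.labels = labels or []

def beam_search(root, x, beam=2, topk=5):
    """Traverse the tree level by level, keeping only the `beam`
    highest-scoring nodes, and collect scored labels at the leaves."""
    frontier = [(0.0, root)]
    results = []
    while frontier:
        nxt = []
        for score, node in frontier:
            if node.labels:  # leaf reached: emit its products
                results.extend((score, label) for label in node.labels)
            for child in node.children:
                nxt.append((score + float(child.weight @ x), child))
        nxt.sort(key=lambda t: -t[0])
        frontier = nxt[:beam]  # prune: this is what keeps cost logarithmic
    results.sort(key=lambda t: -t[0])
    return [label for _, label in results[:topk]]
```

Because each level of the tree is pruned to a fixed beam width, the number of linear scorers evaluated per query depends on tree depth, not catalog size, which is what makes millisecond-scale inference over a 100-million-product catalog plausible.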