Product quantization (PQ) is a popular approach to maximum inner product search (MIPS), which is widely used in ad-hoc retrieval. Recent studies propose differentiable PQ, in which the embedding and quantization modules are trained jointly. However, appropriate joint training objectives are not yet well understood, and the improvements over non-differentiable baselines are not consistent in practice. In this work, we propose Search-oriented Product Quantization (SoPQ), which formulates a novel training objective, MCL. Minimizing MCL maximizes the matching probability between query and key under differentiable PQ. In addition, the VCS protocol is designed to facilitate the minimization of MCL, and SQL is leveraged to relax the dependency on labeled data. Extensive experiments on four real-world datasets validate the effectiveness of the proposed methods.
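For readers unfamiliar with the baseline, the following is a minimal sketch of conventional, non-differentiable PQ applied to MIPS. It is not SoPQ and does not implement MCL, VCS, or SQL. The function names (`train_pq`, `encode`, `approx_inner_products`), the plain k-means codebook training, and all sizes are assumptions chosen for illustration only.

```python
import numpy as np

def train_pq(keys, num_subspaces, num_codewords, iters=10, seed=0):
    """Learn one k-means codebook per subspace (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = keys.shape
    assert d % num_subspaces == 0, "dimension must split evenly"
    sub_d = d // num_subspaces
    codebooks = np.empty((num_subspaces, num_codewords, sub_d))
    for m in range(num_subspaces):
        sub = keys[:, m * sub_d:(m + 1) * sub_d]
        # Initialize centers from randomly chosen key sub-vectors.
        centers = sub[rng.choice(n, num_codewords, replace=False)].copy()
        for _ in range(iters):
            dist = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            for k in range(num_codewords):
                members = sub[assign == k]
                if len(members):  # keep the old center if a cluster is empty
                    centers[k] = members.mean(0)
        codebooks[m] = centers
    return codebooks

def encode(keys, codebooks):
    """Replace each key sub-vector by the index of its nearest codeword."""
    num_subspaces, _, sub_d = codebooks.shape
    codes = np.empty((keys.shape[0], num_subspaces), dtype=np.int64)
    for m in range(num_subspaces):
        sub = keys[:, m * sub_d:(m + 1) * sub_d]
        dist = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(-1)
        codes[:, m] = dist.argmin(1)
    return codes

def approx_inner_products(query, codes, codebooks):
    """Score all keys with one lookup table per subspace."""
    num_subspaces, _, sub_d = codebooks.shape
    # table[m, k] = inner product of the m-th query chunk with codeword k.
    table = np.einsum("mkd,md->mk", codebooks,
                      query.reshape(num_subspaces, sub_d))
    # Sum the looked-up partial inner products across subspaces.
    return table[np.arange(num_subspaces), codes].sum(axis=1)

# Toy data: 200 random keys of dimension 8, one random query.
rng = np.random.default_rng(0)
keys = rng.normal(size=(200, 8))
query = rng.normal(size=8)

codebooks = train_pq(keys, num_subspaces=4, num_codewords=16)
codes = encode(keys, codebooks)
scores = approx_inner_products(query, codes, codebooks)
best = int(scores.argmax())  # index of the key with the highest approximate score
```

In this sketch the codebooks are fitted after the embeddings are fixed, so the quantization step never influences how queries and keys are embedded. Differentiable PQ, as described above, trains the two modules jointly instead.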