In-memory approximate nearest neighbor search (ANNS) algorithms have achieved great success in fast, high-recall query processing, but they are extremely inefficient when handling hybrid queries that combine unstructured (i.e., feature vector) and structured (i.e., related attribute) constraints. In this paper, we present HQANN, a simple yet highly efficient hybrid query processing framework that can be easily embedded into existing proximity graph-based ANNS algorithms. We guarantee both low latency and high recall by leveraging navigation sense among attributes and fusing vector similarity search with attribute filtering. Experimental results on both public and in-house datasets demonstrate that HQANN is 10x faster than state-of-the-art hybrid ANNS solutions at the same recall quality, and that its performance is hardly affected by the complexity of the attributes. It can reach 99% recall@10 in around 50 microseconds on GLOVE-1.2M with thousands of attribute constraints.
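To make the idea of fusing vector similarity with attribute filtering concrete, here is a minimal sketch in Python. It is not the HQANN implementation: the abstract does not give the fusion formula, so the weighted sum in `fused_distance`, the `attr_weight` value, the toy data, and the simple best-first graph search are all assumptions chosen for illustration. The point it shows is that a single fused metric lets an ordinary proximity-graph search prefer attribute-matching items without a separate filtering pass.

```python
import heapq
import math


def vector_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attribute_mismatches(query_attrs, item_attrs):
    """Count attribute positions where the item differs from the query."""
    return sum(1 for q, v in zip(query_attrs, item_attrs) if q != v)


def fused_distance(query, item, attr_weight=1000.0):
    """Hypothetical fusion: attribute mismatches dominate, vector distance breaks ties.

    The weighted-sum form and attr_weight are assumptions, not taken from the paper.
    """
    q_vec, q_attrs = query
    i_vec, i_attrs = item
    return attr_weight * attribute_mismatches(q_attrs, i_attrs) + vector_distance(q_vec, i_vec)


def greedy_search(graph, items, query, entry, k=1, ef=8):
    """Best-first search over a proximity graph, ranking nodes by the fused distance."""
    visited = {entry}
    d0 = fused_distance(query, items[entry])
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    results = [(-d0, entry)]     # max-heap (negated) holding the best ef nodes so far
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest unexpanded node is worse than the worst kept result.
        if len(results) >= ef and d > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = fused_distance(query, items[nb])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    ranked = sorted((-neg_d, n) for neg_d, n in results)
    return [n for _, n in ranked[:k]]


# Toy data (hypothetical): each item is (feature vector, attribute tuple).
items = [
    ((0.0, 0.0), ("red",)),
    ((1.0, 0.0), ("blue",)),
    ((0.9, 0.1), ("red",)),
    ((5.0, 5.0), ("blue",)),
    ((1.2, 0.0), ("red",)),
]
# Hand-built adjacency lists standing in for a proximity graph.
graph = {0: [1, 2], 1: [0, 2, 4], 2: [0, 1, 4], 3: [4, 1], 4: [1, 2, 3]}

query = ((1.0, 0.0), ("red",))
top2 = greedy_search(graph, items, query, entry=0, k=2)
print(top2)
```

In this toy setup, item 1 has exactly the query's vector but the wrong attribute, so the fused metric ranks it behind the attribute-matching items 2 and 4. A pure vector search would return item 1 first and need a separate filtering step to discard it.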