Nearest neighbor search is fundamental to a wide range of applications. Since the exact nearest neighbor search suffers from the "curse of dimensionality", approximate approaches, such as Locality-Sensitive Hashing (LSH), are widely used to trade a little query accuracy for a much higher query efficiency. In many scenarios, it is necessary to perform nearest neighbor search under multiple weighted distance functions in high-dimensional spaces. This paper considers the important problem of supporting efficient approximate nearest neighbor search for multiple weighted distance functions in high-dimensional spaces. To the best of our knowledge, prior work can only solve the problem for the $l_2$ distance. However, numerous studies have shown that the $l_p$ distance with $p\in(0,2)$ could be more effective than the $l_2$ distance in high-dimensional spaces. We propose a novel method, WLSH, to address the problem for the $l_p$ distance for $p\in(0,2]$. WLSH takes the LSH approach and can theoretically guarantee both the efficiency of processing queries and the accuracy of query results while minimizing the required total number of hash tables. We conduct extensive experiments on synthetic and real data sets, and the results show that WLSH achieves high performance in terms of query efficiency, query accuracy and space consumption.
翻译:近邻搜索是一系列广泛应用的基础。 由于最近的近邻搜索具有“ 维度的诅咒”, 近邻搜索具有“ 维度的诅咒”, 近似方法( 如本地- 敏感散列( LSH) ) 被广泛用来交换查询精度, 以便提高查询效率。 在许多情形中, 必须在高维空间的多个加权距离功能下进行近邻搜索。 本文考虑了支持近邻高效近邻搜索高维度空间多个加权距离功能的重要问题。 根据我们所知, 先前的工作只能解决“ 维度” 问题。 然而, 许多研究表明, 美元( 0. 2 美元) 的远方( $- p) 可以比高维空间的远点( $_ 2 美元) 更有效。 我们提出了一个新的方法, 即 WLSH, 来解决 $_ p$ ( 0. 2 美元) 的距离问题。 WLSH 采取LSH 方法, 理论上可以保证处理查询的效率和查询结果的准确性, 同时又尽量减少所需的总数量。 我们进行高度的合成LSLS 和高空间查询 。 我们进行高度的合成的实验, 和高度 。