Nearest Neighbor Search (NNS) over generalized weighted distances is fundamental to a wide range of applications. The problem of NNS over the generalized weighted square Euclidean distance has been studied in previous work. However, numerous studies have shown that the Manhattan distance could be more effective than the Euclidean distance for high-dimensional NNS, which indicates that the generalized weighted Manhattan distance is possibly more practical than the generalized weighted square Euclidean distance in high dimensions. To the best of our knowledge, no prior work solves the problem of NNS over the generalized weighted Manhattan distance in sublinear time. This paper achieves the goal by proposing two novel hashing schemes ($d_w^{l_1},l_2$)-ALSH and ($d_w^{l_1},\theta$)-ALSH.
翻译:近邻搜索(NNS)超过普遍加权距离是广泛应用的基础。 在以前的工作中,已经研究过全重加权欧几里德距离的NNS问题。然而,许多研究表明,曼哈顿距离比高维NNS的欧几里德距离更有效,这表明,曼哈顿普遍加权距离可能比高维度普遍加权欧几里德距离更为实用。据我们所知,此前没有一项工作解决NNS问题,超过曼哈顿亚线性超重加权距离的问题。本文件通过提出两种新型的集水计划(d_w ⁇ l_1},l_2美元)-ALSH和(d_wül_1},\theta$)-ALSH,实现了目标。