This paper studies the classical problem of finding, for every point of a query set $Q$, its $k$ nearest neighbors in a reference set $R$ within any metric space. The well-known 2006 work by Beygelzimer, Kakade, and Langford introduced cover trees and claimed to guarantee a near-linear time complexity in the size $|R|$ of the reference set for $k=1$. Our previous work defined compressed cover trees and corrected the key arguments for $k\geq 1$ and for previously unknown challenging data cases. In 2009, Ram, Lee, March, and Gray attempted to improve the time complexity by using a pair of cover trees, one on the query set and one on the reference set. In 2015, Curtin with the above co-authors used extra parameters to finally prove a similar complexity for $k=1$. Our work fills all previous gaps and substantially improves the neighbor search by using pairs of new compressed cover trees. The novel imbalance parameter of paired trees allows us to prove a better time complexity for any number of neighbors $k\geq 1$.
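To make the problem statement concrete, the following is a minimal brute-force sketch of the all-$k$-nearest-neighbors search in an arbitrary metric space. It is only a baseline for illustration and is not the cover tree or compressed cover tree method of the paper. The function name `all_knn_brute_force` and its parameters are hypothetical, and the distance function `dist` is assumed to be a metric supplied by the caller. The baseline evaluates $|Q|\cdot|R|$ distances, which is the cost that tree-based searches aim to reduce.

```python
import heapq
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")  # any point type; only the metric `dist` is used


def all_knn_brute_force(
    queries: Sequence[P],
    references: Sequence[P],
    k: int,
    dist: Callable[[P, P], float],
) -> List[List[int]]:
    """For each query point, return the indices of its k nearest
    reference points, ordered by increasing distance.

    Hypothetical baseline: it computes every query-reference distance.
    """
    if not 1 <= k <= len(references):
        raise ValueError("k must satisfy 1 <= k <= len(references)")
    result = []
    for q in queries:
        # Select the k reference indices with the smallest distance to q.
        nearest = heapq.nsmallest(
            k, range(len(references)), key=lambda i: dist(q, references[i])
        )
        result.append(nearest)
    return result


# Example on the real line with the metric |x - y|.
R = [0.0, 1.0, 5.0, 6.0]
Q = [0.4, 5.6]
neighbors = all_knn_brute_force(Q, R, k=2, dist=lambda x, y: abs(x - y))
```

Here `neighbors[j]` lists the indices in `R` of the two nearest neighbors of `Q[j]`. Any tree-based search for the same problem should return the same neighbors whenever all distances are distinct.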