A Java parallel streams implementation of the $K$-nearest neighbor descent algorithm is presented using a natural statistical termination criterion. Input data consist of a set $S$ of $n$ objects of type V, and a Function<V, Comparator<V>>, which enables any $x \in S$ to decide which of $y, z \in S\setminus\{x\}$ is more similar to $x$. Experiments with the Kullback-Leibler divergence Comparator support the prediction that the number of rounds of $K$-nearest neighbor updates need not exceed twice the diameter of the undirected version of a random regular out-degree $K$ digraph on $n$ vertices. Overall complexity was $O(n K^2 \log_K(n))$ in the class of examples studied. When objects are sampled uniformly from a $d$-dimensional simplex, accuracy of the $K$-nearest neighbor approximation is high up to $d = 20$, but declines in higher dimensions, as theory would predict.
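To make the input contract concrete, here is a minimal sketch of how a `Function<V, Comparator<V>>` could be built from the Kullback-Leibler divergence, taking `V = double[]` for points of a simplex. The class name `KlComparatorSketch`, the helper `nearest`, and the choice of direction $D(x \,\|\, y)$ are assumptions made for illustration and are not taken from the implementation described above. The brute-force search only shows how the comparator is consumed; it is not the $K$-nearest neighbor descent algorithm itself.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class KlComparatorSketch {

    // Kullback-Leibler divergence D(p || q) for strictly positive probability vectors.
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += p[i] * Math.log(p[i] / q[i]);
        }
        return sum;
    }

    // For each x, returns a Comparator that ranks y before z when D(x || y) < D(x || z).
    // The direction of the divergence is an assumption of this sketch.
    static Function<double[], Comparator<double[]>> klComparator() {
        return x -> Comparator.comparingDouble(y -> kl(x, y));
    }

    // Hypothetical brute-force baseline: the k elements of s \ {x} most similar to x.
    // Membership of x is tested by reference, so x must be the same array object as in s.
    static List<double[]> nearest(List<double[]> s, double[] x, int k,
                                  Function<double[], Comparator<double[]>> similarity) {
        return s.parallelStream()
                .filter(y -> y != x)
                .sorted(similarity.apply(x))
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        double[] x = {0.5, 0.5};
        List<double[]> s = Arrays.asList(
                x, new double[] {0.9, 0.1}, new double[] {0.7, 0.3}, new double[] {0.6, 0.4});
        for (double[] y : nearest(s, x, 2, klComparator())) {
            System.out.println(Arrays.toString(y));
        }
    }
}
```

Because only a ranking relative to $x$ is required, the same interface would accept any similarity notion that orders pairs, including asymmetric ones such as this divergence.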