Search trees on trees (STTs) generalize the fundamental binary search tree (BST) data structure: in STTs the underlying search space is an arbitrary tree, whereas in BSTs it is a path. An optimal BST of size $n$ can be computed for a given distribution of queries in $O(n^2)$ time [Knuth 1971], and centroid BSTs provide a nearly optimal alternative, computable in $O(n)$ time [Mehlhorn 1977]. By contrast, optimal STTs are not known to be computable in polynomial time, and the fastest constant-approximation algorithm runs in $O(n^3)$ time [Berendsohn, Kozma 2022]. Centroid trees can be defined for STTs analogously to BSTs, and they have been used in a wide range of algorithmic applications. In the unweighted case (i.e., for a uniform distribution of queries), a centroid tree can be computed in $O(n)$ time [Brodal et al. 2001; Della Giustina et al. 2019]. These algorithms, however, do not readily extend to the weighted case. Moreover, no approximation guarantees were previously known for centroid trees in either the unweighted or the weighted case. In this paper we revisit centroid trees in a general, weighted setting, and we settle both the algorithmic complexity of constructing them and the quality of their approximation. For constructing a weighted centroid tree, we give an output-sensitive $O(n\log h)\subseteq O(n\log n)$ time algorithm, where $h$ is the height of the resulting centroid tree. If the weights are of polynomial complexity, the running time is $O(n\log\log n)$. We show these bounds to be optimal in a general decision tree model of computation. For approximation, we prove that the cost of a centroid tree is at most twice the optimum, and that this guarantee is best possible in both the weighted and unweighted cases. We also give tight, fine-grained bounds on the approximation ratio for bounded-degree trees and on the approximation ratio of more general $\alpha$-centroid trees.
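To make the object under discussion concrete, the following is a minimal Python sketch of a naive weighted centroid tree construction. It is not the $O(n\log h)$ algorithm announced above; it recomputes subtree weights for every component and takes time proportional to $n$ times the height of the result. It assumes the usual definition of a centroid (a vertex whose removal leaves components each of weight at most half the total) and non-negative weights. The function name `centroid_tree` and the adjacency-dictionary input format are illustrative choices, not taken from the paper.

```python
def centroid_tree(adj, weight):
    """Naive weighted centroid tree (illustrative sketch, not the paper's algorithm).

    adj:    dict mapping each vertex to a list of its neighbours (must be a tree)
    weight: dict mapping each vertex to a non-negative weight
    Returns (root, parent), where parent[v] is v's parent in the centroid tree.
    """
    removed = set()   # vertices already placed in the centroid tree
    parent = {}

    def find_centroid(start):
        # Iterative DFS over the component of `start` among non-removed vertices.
        order, par = [], {start: None}
        stack = [start]
        while stack:
            v = stack.pop()
            order.append(v)
            for u in adj[v]:
                if u not in removed and u != par[v]:
                    par[u] = v
                    stack.append(u)
        # Subtree weights: children appear after their parents in `order`.
        sub = {v: weight[v] for v in order}
        for v in reversed(order):
            if par[v] is not None:
                sub[par[v]] += sub[v]
        total = sub[start]
        for v in order:
            # Heaviest component that remains after deleting v.
            heaviest = total - sub[v]
            for u in adj[v]:
                if u not in removed and par.get(u) == v:
                    heaviest = max(heaviest, sub[u])
            if 2 * heaviest <= total:
                return v

    root = None
    work = [(next(iter(adj)), None)]  # (some vertex of a component, centroid-tree parent)
    while work:
        start, p = work.pop()
        c = find_centroid(start)
        parent[c] = p
        if p is None:
            root = c
        removed.add(c)
        for u in adj[c]:
            if u not in removed:
                work.append((u, c))
    return root, parent


# Usage: a path on five vertices, first with uniform weights, then with a heavy endpoint.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
uniform_root, uniform_parent = centroid_tree(path, {v: 1 for v in path})
heavy_root, heavy_parent = centroid_tree(path, {0: 10, 1: 1, 2: 1, 3: 1, 4: 1})
```

With uniform weights the middle vertex of the path becomes the root, as in a balanced BST; with most of the weight on vertex 0, that vertex becomes the root instead, which illustrates why the height $h$ can be much larger than $\log n$ in the weighted case.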