This paper presents a parallel solution to the optimal binary search tree problem, based on the coarse-grained multicomputer (CGM) model and using the four-splitting technique. Knuth's well-known sequential algorithm solves this problem in $\mathcal{O}\left(n^2\right)$ time and space, where $n$ is the number of keys used to build the optimal binary search tree. To parallelize this algorithm on the CGM model, the irregular partitioning technique, which subdivides the dependency graph into subgraphs (or blocks) of variable size, has been proposed to address the trade-off between minimizing the number of communication rounds and balancing the load across processors. This technique, however, induces high processor latency (which accounts for most of the global communication time), because varying the block sizes does not allow processors to start evaluating some blocks as soon as the data they need are available. The four-splitting technique proposed in this paper overcomes this shortcoming by evaluating a block as a sequence of computation and communication steps over four subblocks. The resulting CGM-based parallel solution requires $\mathcal{O}\left(n^2/\sqrt{p} \right)$ execution time with $\mathcal{O}\left( k \sqrt{p}\right)$ communication rounds, where $p$ is the number of processors and $k$ is the number of times the block size is subdivided. In an experimental study on 128 processors with 40960 keys, the solution based on the irregular partitioning technique reaches a speedup factor of up to $\times$10.39 when $k = 2$, whereas the proposed solution reaches up to $\times$13.12, rising to $\times$14.93 when $k = 5$.
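For reference, the sequential $\mathcal{O}\left(n^2\right)$ baseline mentioned above can be sketched as follows. This is a minimal illustration and not the CGM solution of this paper: it assumes a simplified variant of the problem in which only successful-search frequencies are given (Knuth's formulation also accounts for unsuccessful searches), and the names `optimal_bst_cost`, `freq`, `cost`, and `root` are hypothetical. The speedup comes from restricting the candidate roots of a subproblem to the range between the optimal roots of its two neighbouring subproblems.

```python
def optimal_bst_cost(freq):
    """Return (minimal weighted search cost, root table) for sorted keys.

    freq[i] is the access frequency of the i-th key (assumption: only
    successful searches are counted in this simplified variant).
    """
    n = len(freq)
    if n == 0:
        return 0, []

    # prefix[j] - prefix[i] is the total frequency of keys i..j-1.
    prefix = [0] * (n + 1)
    for i, f in enumerate(freq):
        prefix[i + 1] = prefix[i] + f

    # cost[i][j]: optimal cost for the half-open key range i..j-1.
    # root[i][j]: index of an optimal root for that range.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        cost[i][i + 1] = freq[i]
        root[i][i + 1] = i

    # Fill the table diagonal by diagonal (increasing range length).
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            best, best_r = None, None
            # Knuth's bound: only roots between the optimal roots of the
            # ranges i..j-2 and i+1..j-1 need to be tried.
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                c = cost[i][r] + cost[r + 1][j]
                if best is None or c < best:
                    best, best_r = c, r
            cost[i][j] = best + (prefix[j] - prefix[i])
            root[i][j] = best_r

    return cost[0][n], root
```

Each entry `cost[i][j]` depends on entries to its left in the same row and below it in the same column; these dependencies form the graph that a CGM-based solution would partition into blocks.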