A number of problems in parallel computing require reasoning about the dependency structure in parallel programs. For example, dynamic race detection relies on efficient "on-the-fly" determination of dependencies between sequential and parallel tasks (e.g. to quickly determine whether or not two memory accesses occur in parallel). Several solutions to this "parallel order maintenance" problem has been proposed, but they all have several drawbacks, including lack of provable bounds, high asymptotic or practical overheads, and poor support for parallel execution. In this paper, we present a solution to the parallel order maintenance problem that is provably efficient, fully parallel, and practical. Our algorithm -- called DePa -- represents a computation as a graph and encodes vertices in the graph with two components: a dag-depth and a fork-path. In this encoding, each query requires $O(f/\omega)$ work, where $f$ is the minimum dynamic nesting depth of the two vertices compared, and $\omega$ is the word-size. In the common case (where $f$ is small, e.g., less than 100), each query requires only a single memory lookup and a small constant number of bitwise instructions. Furthermore, graph maintenance at forks and joins requires only constant work, resulting in no asymptotic impact on overall work and span. DePa is therefore work-efficient and fully parallel.
翻译:平行计算中的一些问题需要推理平行程序的依赖性结构。 例如,动态种族检测依赖于高效的“ 即时” 确定连续和平行任务之间的依赖性(例如,快速确定是否同时出现两个内存存存取) 。 已经提出了解决“ 平行命令维护” 问题的若干办法, 但都有几个缺点, 包括缺乏可辨别的范围、 高无干扰或实际管理费用, 以及对平行执行的支持不足 。 本文中, 我们提出了一个平行种族检测的解决方案 。 我们的算法( 称为 DePa ) 代表了图表和编码的顶部的计算, 有两个组成部分: 深度和叉路徑 。 在此编码中, 每个查询都需要 $( f/\\ omega) 工作, 其中$f是两个脊椎的最低动态嵌顶深度。 相比之下, $\\ omga 值是字型维护问题 。 在普通案例中, $f$( ) 只需要一个不变的平面、 直径、 和直径的指令, 而不是一个小的平面, 。