We present a work-efficient parallel level-synchronous Breadth First Search (BFS) algorithm for shared-memory architectures which achieves the theoretical lower bound on parallel running time. The optimality holds regardless of the shape of the graph. We also demonstrate the implication of this optimality for the energy consumption of the program empirically. The key idea is never to use more processing cores than necessary to complete the work in any computation step efficiently. We keep the rest of the cores idle to save energy and to reduce other resource contentions (e.g., bandwidth, shared caches, etc). Our BFS does not use locks and atomic instructions and is easily extendible to shared-memory coprocessors.
翻译:我们为共享模拟结构提出了一个工作效率平行水平同步第一次搜索(BFS)算法,该算法在平行运行时间上实现了理论下限。 最佳性可以维持, 不论图形的形状如何。 我们还从经验上展示了这种最佳性对程序能源消耗的影响。 关键的想法是,在任何计算步骤中,要高效率地完成工作,处理核心永远不会超过必要的数量。 我们让其他核心保持闲置,以节省能源并减少其他资源争议( 如带宽、共享缓存等 ) 。 我们的 BFS 不使用锁和原子指令, 并且很容易推广到共享的模拟共处理器 。