Enumerating simple cycles has important applications in computational biology, network science, and financial crime analysis. In this work, we focus on parallelising the state-of-the-art simple cycle enumeration algorithms by Johnson and Read-Tarjan, along with their applications to temporal graphs. To our knowledge, we are the first to parallelise these two algorithms in a fine-grained manner, and the first to demonstrate linear performance scaling experimentally. This scaling is made possible by decomposing long sequential searches into fine-grained tasks, which are then dynamically scheduled across CPU cores, enabling optimal load balancing. Furthermore, we show that coarse-grained parallel versions of the Johnson and the Read-Tarjan algorithms, which exploit edge- or vertex-level parallelism, are not scalable. On a cluster of four multi-core CPUs with $256$ physical cores, our fine-grained parallel algorithms are, on average, an order of magnitude faster than their coarse-grained parallel counterparts. The performance gap between the fine-grained and the coarse-grained parallel algorithms widens as more CPU cores are used. When using all $256$ CPU cores, our parallel algorithms enumerate temporal cycles, on average, $260\times$ faster than the serial algorithm of Kumar and Calders.
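To make the idea of fine-grained task decomposition concrete, the following is a minimal illustrative sketch and not the algorithm of this work. It assumes plain backtracking search rather than the Johnson or Read-Tarjan algorithms (it has none of their pruning), and the names `expand` and `enumerate_cycles` are hypothetical. Each task extends a path by a single edge and hands the longer paths back to a shared pool, so work from one long search can be spread over many workers. In CPython, threads generally do not execute bytecode in parallel, so the sketch shows only the task structure, not a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def expand(graph, path):
    """One fine-grained task: extend `path` by a single edge.

    Returns (cycles closed by this step, longer paths to explore as new tasks).
    Assumption: vertices are mutually comparable (e.g. integers).
    """
    start, last = path[0], path[-1]
    cycles, children = [], []
    for nxt in graph.get(last, ()):
        if nxt == start:
            # An edge back to the root closes a simple cycle.
            cycles.append(path)
        elif nxt > start and nxt not in path:
            # Only visit vertices larger than the root, so each cycle is
            # reported once, rooted at its smallest vertex.
            children.append(path + (nxt,))
    return cycles, children


def enumerate_cycles(graph, workers=4):
    """Enumerate simple cycles of a directed graph given as {vertex: [successors]}."""
    cycles = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Seed one task per starting vertex.
        pending = {pool.submit(expand, graph, (v,)) for v in graph}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                found, children = fut.result()
                cycles.extend(found)
                # Every unexplored path prefix becomes its own task, so idle
                # workers can pick up work from any part of the search.
                pending |= {pool.submit(expand, graph, p) for p in children}
    return cycles
```

For example, on the graph `{0: [1], 1: [2, 0], 2: [0, 3], 3: [1]}` the sketch reports the cycles rooted at their smallest vertex as tuples of vertices. A coarse-grained variant would instead run the entire search from each starting vertex as a single task, which leaves cores idle whenever a few starting vertices dominate the work.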