Graphs have become a key tool when modeling and solving problems in different areas. The Floyd-Warshall (FW) algorithm computes the shortest path between all pairs of vertices in a graph and is employed in areas like communication networking, traffic routing, bioinformatics, among others. However, FW is computationally and spatially expensive since it requires O(n^3) operations and O(n^2) memory space. As the graph gets larger, parallel computing becomes necessary to provide a solution in an acceptable time range. In this paper, we studied a FW code developed for Xeon Phi KNL processors and adapted it to run on any Intel x86 processors, losing the specificity of the former. To do so, we verified one by one the optimizations proposed by the original code, making adjustments to the base code where necessary, and analyzing its performance on two Intel servers under different test scenarios. In addition, a new optimization was proposed to increase the concurrency degree of the parallel algorithm, which was implemented using two different synchronization mechanisms. The experimental results show that all optimizations were beneficial on the two x86 platforms selected. Last, the new optimization proposal improved performance by up to 23%.
翻译:暂无翻译