The theory of divide-and-conquer parallelization has been well studied, providing a solid basis for exploring different approaches to parallelizing merge sort in Python. Python's simplicity and extensive selection of libraries make it the most popular scientific programming language, so it is a fitting language in which to implement and analyze these algorithms. In this paper, we use the Python packages multiprocessing and mpi4py to implement several parallel merge sort algorithms. Experiments are conducted on an academic supercomputer, and benchmarks are performed using Cloudmesh. We find that hybrid multiprocessing merge sort outperforms several other algorithms, achieving a 1.5x speedup over the built-in Python sorted() and a 34x speedup over sequential merge sort. Our results provide insight into different approaches to implementing parallel merge sort in Python and contribute to the understanding of general divide-and-conquer parallelization in Python on both shared and distributed memory systems.
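As a rough illustration of the shared-memory approach, the sketch below shows one way a multiprocessing merge sort could be structured. It is not the paper's implementation: the abstract does not define "hybrid", so the sketch assumes it means sorting each chunk with the built-in sorted() in a worker process and then merging the sorted chunks. The names split_chunks and hybrid_merge_sort and the processes parameter are hypothetical.

```python
import heapq
from multiprocessing import Pool


def split_chunks(data, n):
    """Split data into n contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def hybrid_merge_sort(data, processes=4):
    """Assumed 'hybrid' scheme: built-in sort per chunk, then a k-way merge."""
    data = list(data)
    # Skip the process pool when it cannot help (hypothetical cutoff).
    if processes <= 1 or len(data) < 2:
        return sorted(data)
    chunks = split_chunks(data, processes)
    # Each worker process sorts one chunk with the built-in sorted().
    with Pool(processes) as pool:
        sorted_chunks = pool.map(sorted, chunks)
    # heapq.merge lazily merges the already-sorted chunks in the parent.
    return list(heapq.merge(*sorted_chunks))
```

When run as a script, the parallel path should be called from under an `if __name__ == "__main__":` guard, for example `hybrid_merge_sort(data, processes=4)`, because some multiprocessing start methods re-import the main module in each worker. The final merge here runs sequentially in the parent process, which is a simplification of this sketch rather than a claim about the algorithms benchmarked in the paper.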