Python has become a dominant programming language for emerging areas such as Machine Learning (ML), Deep Learning (DL), and Data Science (DS). An attractive feature of Python is that it provides an easy-to-use programming interface while allowing library developers to enhance the performance of their applications by harnessing the computing power of High Performance Computing (HPC) platforms. Efficient communication is key to scaling applications on parallel systems; on HPC hardware it is typically enabled by the Message Passing Interface (MPI) standard and compliant libraries. mpi4py is a Python-based communication library that provides an MPI-like interface for Python applications, allowing application developers to utilize parallel processing elements including GPUs. However, there is currently no benchmark suite to evaluate the communication performance of mpi4py, and of Python MPI codes in general, on modern HPC systems. To bridge this gap, we propose OMB-Py, a set of Python extensions to the open-source OSU Micro-Benchmark (OMB) suite, aimed at evaluating the communication performance of MPI-based parallel applications in Python. To the best of our knowledge, OMB-Py is the first communication benchmark suite for parallel Python applications. OMB-Py consists of a variety of point-to-point and collective communication benchmark tests that are implemented for a range of popular Python libraries including NumPy, CuPy, Numba, and PyCUDA. We also provide Python implementations of several distributed ML algorithms as benchmarks to understand the potential performance gain for ML/DL workloads. Our evaluation reveals that mpi4py introduces a small overhead compared to native MPI libraries. We also evaluate the ML/DL workloads and report up to 106x speedup on 224 CPU cores compared to sequential execution. We plan to publicly release OMB-Py to benefit the Python HPC community.
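To make the idea of a point-to-point benchmark concrete, the following is a minimal sketch of a blocking ping-pong latency test of the kind such suites commonly include. It is not the OMB-Py code: the function names `one_way_latency_us`, `pingpong_latency`, and `main`, the default iteration counts, and the message sizes are all hypothetical choices made for illustration. Plain `bytearray` buffers are used as an assumption to keep the sketch dependency-free; a NumPy array could be passed to `Send`/`Recv` in the same way.

```python
import time


def one_way_latency_us(elapsed_s, iterations):
    """Average one-way latency in microseconds.

    Each iteration is a full round trip, so the total time is
    divided by 2 * iterations.
    """
    return elapsed_s * 1e6 / (2 * iterations)


def pingpong_latency(comm, nbytes, iterations=1000, skip=100):
    """Time a blocking ping-pong between ranks 0 and 1 of `comm`.

    `comm` only needs Get_rank, Barrier, Send and Recv, which an
    mpi4py communicator provides. Ranks other than 0 and 1 take part
    in the barrier and then return None.
    """
    rank = comm.Get_rank()
    send_buf = bytearray(nbytes)
    recv_buf = bytearray(nbytes)

    comm.Barrier()  # start all ranks together
    if rank > 1:
        return None

    peer = 1 - rank
    start = 0.0
    for i in range(skip + iterations):
        if i == skip:
            # Warm-up iterations are excluded from the measurement.
            start = time.perf_counter()
        if rank == 0:
            comm.Send(send_buf, dest=peer, tag=1)
            comm.Recv(recv_buf, source=peer, tag=1)
        else:
            comm.Recv(recv_buf, source=peer, tag=1)
            comm.Send(send_buf, dest=peer, tag=1)
    elapsed = time.perf_counter() - start
    return one_way_latency_us(elapsed, iterations)


def main():
    # Imported here so the helpers above can be used without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    if comm.Get_size() < 2:
        raise SystemExit("run with at least two MPI processes")
    for nbytes in (1, 1024, 1024 * 1024):  # illustrative message sizes
        latency = pingpong_latency(comm, nbytes)
        if comm.Get_rank() == 0:
            print(f"{nbytes:>8d} bytes  {latency:10.2f} us")
```

To try it, one could save the sketch to a file, add a call to `main()` at the end, and launch it with two processes through the MPI launcher, for example `mpiexec -n 2 python pingpong.py` (the file name is arbitrary). Running the same loop in a native MPI program would give a baseline against which the Python-side overhead can be compared.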