Numba is a game-changing compiler for high-performance computing with Python. It produces machine code that runs outside of the single-threaded Python interpreter and can fully utilize the resources of modern CPUs. This includes multithreaded parallelism and, where the hardware supports it, auto-vectorization, as in compiled languages such as C++ or Fortran. In this article we document our experience developing PyExaFMM, a multithreaded Numba implementation of the Fast Multipole Method, an algorithm with a non-linear data structure and a large amount of data organization. We find that designing performant Numba code for complex algorithms can be as challenging as writing in a compiled language.
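To make the multithreading and auto-vectorization claims concrete, the following is a minimal sketch of a Numba kernel and is not taken from PyExaFMM. It evaluates a direct particle-to-particle sum of 1/r potentials, the kind of quadratic-cost computation that the Fast Multipole Method is designed to accelerate. The function name `direct_potential` and its argument layout are assumptions made for illustration, and the small fallback only exists so that the snippet still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption for portability): run as plain Python without Numba.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True, fastmath=True)
def direct_potential(sources, targets, charges):
    """Hypothetical example kernel: sum of charges[j] / |target_i - source_j|."""
    n_targets = targets.shape[0]
    n_sources = sources.shape[0]
    result = np.zeros(n_targets)
    # prange asks Numba to distribute the outer iterations across threads.
    for i in prange(n_targets):
        acc = 0.0
        # A simple inner loop over contiguous data is a candidate for
        # auto-vectorization; whether it is vectorized depends on LLVM and the CPU.
        for j in range(n_sources):
            dx = targets[i, 0] - sources[j, 0]
            dy = targets[i, 1] - sources[j, 1]
            dz = targets[i, 2] - sources[j, 2]
            r2 = dx * dx + dy * dy + dz * dz
            if r2 > 0.0:  # skip coincident points
                acc += charges[j] / np.sqrt(r2)
        result[i] = acc
    return result


rng = np.random.default_rng(0)
sources = rng.random((200, 3))
targets = rng.random((100, 3))
charges = rng.random(200)
phi = direct_potential(sources, targets, charges)
```

The first call includes compilation time, so any timing comparison should be made on a second call. Each outer iteration writes only to its own `result[i]`, which is what makes the loop safe to parallelize in this sketch.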