Numba is a game-changing compiler for high-performance computing with Python. It produces machine code that runs outside of the single-threaded Python interpreter and can fully utilize the resources of modern CPUs. This includes support for multithreaded parallelism and, where the hardware supports it, auto-vectorization, as with compiled languages such as C++ or Fortran. In this article we document our experience developing PyExaFMM, a multithreaded Numba implementation of the Fast Multipole Method, an algorithm with a non-linear data structure and a large amount of data organization. We find that designing performant Numba code for complex algorithms can be as challenging as writing in a compiled language.