We present a modern C++17-compatible thread pool implementation, built from scratch with high-performance scientific computing in mind. The thread pool is implemented as a single lightweight, self-contained class with no dependencies other than the C++17 standard library, which makes it highly portable. In particular, our implementation does not use OpenMP or any other high-level multithreading API, and therefore gives the programmer precise low-level control over the details of the parallelization, which permits more robust optimizations. The thread pool was extensively tested on both AMD and Intel CPUs with up to 40 cores and 80 threads. This paper provides motivation, detailed usage instructions, performance tests, and the full annotated source code.
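To make the idea of a single-class, standard-library-only thread pool concrete, the following is a minimal sketch and not the implementation presented in this paper. The names `simple_thread_pool`, `submit`, and `parallel_sum_of_squares` are hypothetical and chosen for illustration only. The sketch assumes a fixed number of worker threads that pull tasks from one mutex-protected queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical minimal pool for illustration; not the paper's class or API.
class simple_thread_pool {
public:
    explicit simple_thread_pool(std::size_t thread_count) {
        if (thread_count == 0) thread_count = 1;  // always keep one worker
        for (std::size_t i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Signals the workers to stop, lets them drain the queue, then joins them.
    ~simple_thread_pool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        task_available_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    simple_thread_pool(const simple_thread_pool&) = delete;
    simple_thread_pool& operator=(const simple_thread_pool&) = delete;

    // Queues a callable taking no arguments and returns a future for its result.
    template <typename F>
    auto submit(F&& f) -> std::future<std::invoke_result_t<std::decay_t<F>>> {
        using R = std::invoke_result_t<std::decay_t<F>>;
        // packaged_task is move-only, so it is held by shared_ptr to fit
        // into the copyable std::function stored in the queue.
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.emplace([task] { (*task)(); });
        }
        task_available_.notify_one();
        return result;
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                task_available_.wait(
                    lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mutex_;
    std::condition_variable task_available_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage example: sum the squares 1^2 + ... + n^2, one task per term.
long long parallel_sum_of_squares(std::size_t thread_count, int n) {
    simple_thread_pool pool(thread_count);
    std::vector<std::future<long long>> parts;
    for (int i = 1; i <= n; ++i)
        parts.push_back(pool.submit([i] { return static_cast<long long>(i) * i; }));
    long long total = 0;
    for (auto& part : parts) total += part.get();
    return total;
}
```

One task per term is far too fine-grained for real workloads; it is used here only to keep the example short. On most toolchains the sketch must be linked against the platform threading library (for example with `-pthread` on GCC or Clang).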