Context. We present Fast Correlation Function Calculator (FCFC), a novel high-performance toolkit for exact pair counting, publicly available at https://github.com/cheng-zhao/FCFC. Aims. With the rapid growth of modern cosmological datasets, evaluating correlation functions from observational and simulation catalogues has become a challenge, and high-efficiency pair counting codes are in great demand. Methods. We introduce different data structures and algorithms that can be used for pair counting problems, and perform comprehensive benchmarks to identify the most efficient ones for real-world cosmological applications. We then describe the three levels of parallelism used by FCFC (SIMD, OpenMP, and MPI) and run extensive tests to investigate their scalability. Finally, we compare the efficiency of FCFC against alternative pair counting codes. Results. The data structures and histogram update algorithms implemented in FCFC are shown to outperform alternative methods. FCFC does not benefit much from SIMD, as the bottleneck of our histogram update algorithm is mostly cache latency. Nevertheless, the efficiency of FCFC scales well with the numbers of OpenMP threads and MPI processes, although the speedups may degrade with more than a few thousand threads in total. FCFC is found to be faster than most (if not all) other public pair counting codes for modern cosmological pair counting applications.
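To make the pair counting problem concrete, the following is a minimal Python sketch of exact pair counting with a separation histogram. It is not FCFC's implementation, and it does not claim to use the data structures or histogram update algorithms that FCFC selects. It only illustrates the general idea with one common choice, a uniform grid whose cell size equals the maximum separation, and checks it against a brute-force count. The function names (`bin_index`, `count_pairs_brute`, `count_pairs_grid`) are hypothetical, and the sketch assumes non-periodic boundaries and unweighted points.

```python
import math
from bisect import bisect_right
from collections import defaultdict
from itertools import product


def bin_index(d2, edges2):
    """Return the histogram bin of squared distance d2, or -1 if out of range."""
    if d2 < edges2[0] or d2 >= edges2[-1]:
        return -1
    # Bins are half-open: edges2[k] <= d2 < edges2[k + 1].
    return bisect_right(edges2, d2) - 1


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def count_pairs_brute(points, edges):
    """O(N^2) reference: visit every unordered pair once."""
    edges2 = [e * e for e in edges]  # compare squared values, no sqrt needed
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            b = bin_index(squared_distance(points[i], points[j]), edges2)
            if b >= 0:
                counts[b] += 1
    return counts


def count_pairs_grid(points, edges):
    """Same counts, but only points in neighbouring grid cells are compared."""
    rmax = edges[-1]
    edges2 = [e * e for e in edges]
    counts = [0] * (len(edges) - 1)

    # Assign each point to a cubic cell of side rmax, so that any pair closer
    # than rmax lies in the same cell or in one of the 26 adjacent cells.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / rmax) for c in p)].append(idx)

    for key, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(k + o for k, o in zip(key, offset))
            if neighbour not in cells:
                continue
            for i in members:
                for j in cells[neighbour]:
                    if i < j:  # count each unordered pair exactly once
                        b = bin_index(
                            squared_distance(points[i], points[j]), edges2
                        )
                        if b >= 0:
                            counts[b] += 1
    return counts
```

For example, with points `(0, 0, 0)`, `(1, 0, 0)`, `(0, 2, 0)` and bin edges `[0.5, 1.5, 2.5]`, both functions return `[1, 2]`: one pair at separation 1, and two pairs at separations 2 and √5. The grid reduces the number of candidate pairs when the maximum separation is small compared with the size of the catalogue, which is the regime where the choice of data structure and histogram update strategy matters most.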