Recently, computers have diversified architectures. To achieve high numerical calculation software performance, it is necessary to tune the software according to the target computer architecture. However, code optimization for each environment is difficult unless it is performed by a specialist who knows computer architectures well. By applying autotuning (AT), the tuning effort can be reduced. Optimized implementation by AT that enhances computer performance can be used even by non-experts. In this research, we propose a technique for AT for programs using open multi-processing (OpenMP). We propose an AT method using an AT language that changes the OpenMP optimized loop and dynamically changes the number of threads in OpenMP according to computational kernels. Performance evaluation was performed using the Fujitsu PRIMEHPC FX100, which is a K-computer type supercomputer installed at the Information Technology Center, Nagoya University. As a result, we found there was a performance increase of 1.801 times that of the original code in a plasma turbulence analysis.
翻译:暂无翻译