Optimizing the performance of computational fluid dynamics (CFD) applications accelerated by graphics processing units (GPUs) is crucial for efficient simulations. In this study, we use a machine learning-based autotuning technique to optimize 14 key parameters related to GPU kernel scheduling, including the number of thread blocks and the number of threads per block. The underlying model is a fully connected neural network that takes the tuning parameters as inputs and the measured execution time of a simulation as its output. To assess the approach, we ran experiments on three types of GPUs spanning low to high computational speed, training the network independently for each GPU model and also jointly across multiple GPU models. The autotuner tuned this wide range of parameters effectively and improved the performance of a CFD code while sampling only a small fraction of the large parameter search space. We attribute this sample efficiency to the ability of fully connected neural networks to capture the complex relationships between parameter settings and the resulting performance. Overall, the study demonstrates the potential of machine learning, and of fully connected neural networks in particular, for autotuning GPU-accelerated CFD codes, helping researchers and practitioners reach high performance in scientific simulations through optimized parameter configurations.
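The workflow described above (sample a few configurations, time them, fit a fully connected network from parameters to execution time, then pick the configuration the network predicts to be fastest) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the study: the candidate values in `CHOICES`, the synthetic `measure_time` function that stands in for an actual CFD run, the single hidden layer, and the training settings. Only the count of 14 parameters comes from the text.

```python
import numpy as np

N_PARAMS = 14                                      # parameter count from the abstract
CHOICES = np.array([32, 64, 128, 256, 512, 1024])  # hypothetical allowed values


def measure_time(configs):
    """Hypothetical stand-in for timing one CFD run per configuration.

    Synthetic cost: every parameter is cheapest at 128 (log2 = 7).
    """
    x = np.log2(configs)
    return 1.0 + 0.05 * np.sum((x - 7.0) ** 2, axis=-1)


def encode(configs):
    """Map log2 of each parameter to the range [-1, 1]."""
    return (np.log2(configs) - 7.5) / 2.5


def train_mlp(X, y, hidden=32, epochs=3000, lr=0.02, seed=0):
    """Fit a one-hidden-layer fully connected network by gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, np.sqrt(1.0 / X.shape[1]), (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), (hidden, 1))
    b2 = np.zeros(1)
    mu, sigma = y.mean(), y.std()
    t = ((y - mu) / sigma)[:, None]            # standardized targets
    losses = []
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)               # hidden activations
        err = h @ W2 + b2 - t
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / len(X)             # gradient of MSE w.r.t. output
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # backpropagate through tanh
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)

    def predict(Xq):
        """Predicted execution time in the original (unstandardized) units."""
        return (np.tanh(Xq @ W1 + b1) @ W2 + b2).ravel() * sigma + mu

    return predict, losses


def autotune(n_samples=300, n_candidates=5000, seed=0):
    """Train on a few timed samples, then rank a larger candidate pool."""
    rng = np.random.default_rng(seed)
    sampled = rng.choice(CHOICES, size=(n_samples, N_PARAMS))
    times = measure_time(sampled)              # the only "expensive" step
    predict, losses = train_mlp(encode(sampled), times)
    candidates = rng.choice(CHOICES, size=(n_candidates, N_PARAMS))
    best = candidates[np.argmin(predict(encode(candidates)))]
    return best, losses


if __name__ == "__main__":
    best, losses = autotune()
    print("suggested configuration:", best)
    print("measured time:", measure_time(best))
```

In this sketch the network is only queried on the candidate pool, so the expensive timing step runs once per training sample and once more to confirm the suggested configuration. With 6 values for each of 14 parameters the space holds 6^14 (about 7.8e10) configurations, which is why a cheap surrogate model is attractive compared with exhaustive timing.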