This report presents our work on improving GPU parallelization by supporting compute nodes with multiple GPUs. Because OpenACC's default support for multiple GPUs is limited [6], the current implementation allows each MPI process to access only a single GPU. Thus, the only way to take full advantage of multi-GPU nodes in the current version is to launch multiple processes, which increases resource contention. We investigated the benefits of having a single process offload work to all available GPU devices.
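As an illustration of what single-process offloading to several devices can look like, the following is a minimal sketch in C with OpenACC. It is not the implementation described in this report: the function names `visible_devices` and `scale_on_all_devices` and the simple array-scaling kernel are hypothetical, chosen only to show the pattern. The sketch assumes the standard OpenACC runtime calls `acc_get_device_type`, `acc_get_num_devices`, and `acc_set_device_num`, and it falls back to a single host "device" when compiled without OpenACC support.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: number of devices visible to this process.
   Returns 1 when the code is built without OpenACC. */
static int visible_devices(void) {
#ifdef _OPENACC
    int n = acc_get_num_devices(acc_get_device_type());
    return n > 0 ? n : 1;
#else
    return 1;
#endif
}

/* Hypothetical example kernel: scale x[0..n) by a, giving each device
   one contiguous chunk. Work is queued asynchronously on every device
   first, and only then does the host wait for all of them. */
void scale_on_all_devices(double *x, size_t n, double a) {
    assert(x != NULL || n == 0);

    int ndev = visible_devices();
    size_t chunk = (n + (size_t)ndev - 1) / (size_t)ndev;

    for (int d = 0; d < ndev; ++d) {
        size_t lo = (size_t)d * chunk;
        if (lo >= n) break;                 /* fewer chunks than devices */
        size_t len = (n - lo < chunk) ? n - lo : chunk;
        double *part = x + lo;
#ifdef _OPENACC
        /* Subsequent directives target device d. */
        acc_set_device_num(d, acc_get_device_type());
#endif
        #pragma acc parallel loop copy(part[0:len]) async(1)
        for (size_t i = 0; i < len; ++i) {
            part[i] *= a;
        }
    }

    /* Wait for the queued work on each device in turn. */
    for (int d = 0; d < ndev; ++d) {
#ifdef _OPENACC
        acc_set_device_num(d, acc_get_device_type());
#endif
        #pragma acc wait
    }
}
```

The key design point in this sketch is that the host selects each device in turn and launches its chunk asynchronously, so the devices can run concurrently from one process instead of one process per GPU.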