We present the extension of the Tinker-HP package (Lagardère et al., Chem. Sci., 2018, 9, 956-972) to Graphics Processing Unit (GPU) cards in order to accelerate molecular dynamics simulations using polarizable many-body force fields. The new high-performance module makes efficient use of single- and multi-GPU architectures, from those found in research laboratories to those of modern supercomputing centers. After analyzing in detail our general scalable strategy, which relies on OpenACC and CUDA, we discuss the various capabilities of the package, including the multi-precision options of the code. While an efficient double-precision implementation is provided to preserve the possibility of fast reference computations, we show that lower-precision arithmetic is preferable: it provides similar accuracy for molecular dynamics while delivering superior performance. As Tinker-HP is mainly dedicated to accelerating simulations using new-generation point-dipole polarizable force fields, we focus our study on the implementation of the AMOEBA model. Testing various NVIDIA platforms, including 2080Ti, 3090, V100 and A100 cards, we provide illustrative benchmarks of the code for single- and multi-card simulations on large biosystems encompassing up to millions of atoms. The new code strongly reduces time to solution and offers the best performance obtained to date with the AMOEBA polarizable force field. We discuss perspectives on the strong-scaling performance of our multi-node massively parallel strategy, on unsupervised adaptive sampling, and on the large-scale applicability of the Tinker-HP code in biophysics. The present software has been released in advance on GitHub in connection with the High Performance Computing community's COVID-19 research efforts and is free for academics (see https://github.com/TinkerTools/tinker-hp).