Using ML methods to dynamically steer ensemble-based simulations promises significant improvements in the performance of scientific applications. We present DeepDriveMD, a tool for a range of prototypical ML-driven HPC simulation scenarios, and use it to quantify improvements in the scientific performance of ML-driven ensemble-based applications. We discuss its design and characterize its performance. Motivated by the potential for further scientific improvements and applicability to more sophisticated physical systems, we extend the design of DeepDriveMD to support stream-based communication between simulations and learning methods. DeepDriveMD demonstrates a 100x speedup in protein folding and performs 1.6x more simulations per unit time, improving resource utilization compared to the sequential framework. Experiments are performed on leadership-class platforms, at scales of up to O(1000) nodes, and for production workloads. We establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios that supports diverse simulation and ML back-ends and enables new scientific insights by improving the length and time scales accessed.