Increasingly, scientific discovery requires sophisticated and scalable workflows. Workflows have become the ``new applications,'' wherein multi-scale computing campaigns comprise multiple heterogeneous executable tasks. In particular, the introduction of AI/ML models into traditional HPC workflows has enabled highly accurate modeling, typically at lower computational cost than traditional methods. This chapter discusses various modes of integrating AI/ML models into HPC computations, which result in diverse types of AI-coupled HPC workflows. We motivate the increasing need to couple AI/ML and HPC across scientific domains and illustrate each mode with a number of production-grade use cases. We additionally discuss the primary challenges of extreme-scale AI-coupled HPC campaigns -- task heterogeneity, adaptivity, and performance -- and several framework and middleware solutions that aim to address them. While the HPC workflow and AI/ML computing paradigms are each effective independently, we highlight how their integration, and ultimate convergence, is leading to significant improvements in scientific performance across a range of domains, ultimately enabling scientific explorations that would otherwise be unattainable.