Scientific workflows are a cornerstone of modern scientific computing, and they have underpinned some of the most significant discoveries of the last decade. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale HPC platforms. Workflows will play a crucial role in the data-oriented and post-Moore computing landscape as they democratize the application of cutting-edge research techniques, computationally intensive methods, and the use of new computing platforms. As workflows continue to be adopted by scientific projects and user communities, they are becoming more complex. Workflows are increasingly composed of tasks that perform computations such as short machine learning inference, multi-node simulations, and long-running machine learning model training, among others, and thus increasingly rely on heterogeneous architectures that include not only CPUs but also GPUs and other accelerators. The workflow management system (WMS) technology landscape is currently fragmented and presents significant barriers to entry because hundreds of seemingly comparable, yet incompatible, systems exist. Another fundamental problem is that there are conflicting theoretical bases and abstractions for a WMS. Workflows can likely be translated between systems that use the same underlying abstractions, but not between systems that use different abstractions. More information: https://workflowsri.org/summits/technical