HPC is an enabling platform for AI. Introducing AI workloads into the basket of HPC applications has non-trivial consequences both for how AI applications are designed and for how HPC computing is provided. This is the leitmotif of the convergence between HPC and AI. The formal definition of AI pipelines is one of the milestones of HPC-AI convergence. Done well, it yields portable and scalable applications on the one hand and, on the other, is crucial for the reproducibility of scientific pipelines. In this work, we advocate the StreamFlow Workflow Management System as a crucial ingredient for defining a parametric pipeline, called "CLAIRE COVID-19 Universal Pipeline," which explores the optimization space of methods to classify COVID-19 lung lesions from CT scans, compares them for accuracy, and thereby sets a performance baseline. The universal pipeline automates the training of many different Deep Neural Networks (DNNs) with many different hyperparameters. It therefore requires massive computing power, which is found in traditional HPC infrastructure and made accessible by the portability-by-design of pipelines designed with StreamFlow. Using the universal pipeline, we identified a DNN reaching over 90% accuracy in detecting COVID-19 lesions in CT scans.