We present an end-to-end automated workflow that uses large-scale remote compute resources and an embedded GPU platform at the edge to enable AI/ML-accelerated real-time analysis of data collected for x-ray ptychography. Ptychography is a lensless imaging method that reconstructs a sample through simultaneous numerical inversion of a large number of diffraction patterns from adjacent, overlapping scan positions. This acquisition method can enable nanoscale imaging with x-rays and electrons, but it often requires very large experimental datasets and commensurately long turnaround times, which can limit experimental capabilities such as real-time experimental steering and low-latency monitoring. In this work, we introduce a software system that can automate ptychography data analysis tasks. We accelerate the data analysis pipeline by using a modified version of PtychoNN, an ML-based approach to solving the phase retrieval problem that shows a two-orders-of-magnitude speedup over traditional iterative methods. Further, our system coordinates and overlaps different data analysis tasks to minimize synchronization overhead between stages of the workflow. We evaluate our workflow system with real-world experimental workloads from the 26ID beamline at the Advanced Photon Source and the ThetaGPU cluster at Argonne Leadership Computing Resources.
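The idea of overlapping analysis tasks to reduce synchronization overhead can be illustrated with a minimal sketch. It is not the workflow system described here: it assumes Python threads connected by bounded queues, and the names `run_pipeline`, `stage`, and the stand-in stage functions are hypothetical. Each stage starts on the next item as soon as it is available instead of waiting for the whole dataset to clear the previous stage.

```python
import queue
import threading

SENTINEL = object()  # unique end-of-stream marker (hypothetical convention)


def stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(items, stages, maxsize=4):
    """Run all stages concurrently; bounded queues provide back-pressure."""
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]

    def feed():
        # Feed from a separate thread so the caller can drain results
        # while upstream stages are still working.
        for item in items:
            queues[0].put(item)
        queues[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    threads += [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Stand-ins for, e.g., preprocessing and inference steps (assumed names).
def preprocess(x):
    return x + 1


def infer(x):
    return x * 2


outputs = run_pipeline(range(10), [preprocess, infer])
```

Because each stage is a single thread reading from a FIFO queue, results come out in input order. The bounded `maxsize` keeps a fast producer from running arbitrarily far ahead of a slow consumer, which is one simple way to limit buffering between stages.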