Extracting actionable information from data sources such as the Linac Coherent Light Source (LCLS-II) and the Advanced Photon Source Upgrade (APS-U) is becoming more challenging because of their fast-growing data generation rates. The rapid analysis possible with machine learning (ML) methods can enable fast feedback loops that adjust experimental setups in real time, for example when errors occur or interesting events are detected. However, to avoid degradation in ML performance over time due to changes in an instrument or sample, we need a way to update ML models rapidly while an experiment is running. We present here a data service and a model service that accelerate deep neural network training, with a focus on ML-based scientific applications. Our proposed data service achieves a 100x speedup in data labeling compared to the current state of the art. Further, our model service achieves up to a 200x improvement in training speed. Overall, fairDMS achieves up to a 92x speedup in end-to-end model updating time.