In High Energy Physics, facilities that provide High Performance Computing environments offer an opportunity to efficiently perform the statistical inference required for the analysis of data from the Large Hadron Collider, but they can pose problems for orchestration and efficient scheduling. The compute architectures at these facilities do not easily support the Python compute model, and configuring and scheduling batch jobs for physics often requires expertise in multiple job scheduling services. The combination of the pure-Python libraries pyhf and funcX reduces a common problem in HEP analyses, performing statistical inference with binned models, from a task that would traditionally take multiple hours and bespoke scheduling to an on-demand (fitting) "function as a service" that can execute scalably across workers in just a few minutes, offering reduced time to insight and inference. We demonstrate the execution of a scalable workflow that uses funcX to simultaneously fit 125 signal hypotheses from a published ATLAS search for new physics using pyhf, with a wall time of under 3 minutes. We additionally show performance comparisons for other physics analyses with openly published probability models and argue for a blueprint of fitting-as-a-service systems at HPC centers.
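The fan-out pattern described above — mapping one independent fit per signal hypothesis onto a pool of workers — can be sketched in plain Python. This is a minimal illustration using `concurrent.futures` as a local stand-in for funcX's remote executor; the `fit_hypothesis` function, the signal-point dictionaries, and the dummy CLs values are illustrative assumptions, not the paper's actual code (a real service would call `pyhf.infer.hypotest` on a workspace patched with each signal hypothesis).

```python
from concurrent.futures import ThreadPoolExecutor


def fit_hypothesis(signal_point):
    """Stand-in for a pyhf hypothesis test on one signal point.

    In the real fitting service this body would build the patched
    pyhf workspace for the given signal hypothesis and run
    pyhf.infer.hypotest on it; here a dummy CLs value keeps the
    sketch self-contained and dependency-free.
    """
    mass = signal_point["mass"]
    dummy_cls = 1.0 / (1.0 + mass)  # placeholder, not a real CLs
    return signal_point["name"], dummy_cls


# One entry per signal hypothesis in the scan (illustrative values).
signal_points = [{"name": f"sig_{m}", "mass": m} for m in (100, 200, 300)]

# Fan the independent fits out across workers and gather the results.
# With funcX, pool.map would instead be submissions to a remote endpoint.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(fit_hypothesis, signal_points))

print(results)
```

Because each hypothesis test is independent, the wall time of the whole scan approaches the time of the single slowest fit once enough workers are available, which is the property the abstract's 125-point, under-3-minute result relies on.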