We present a provenance model for the generic workflow of numerical Lattice Quantum Chromodynamics (QCD) calculations, which constitute an important component of particle physics research. These calculations are carried out on the largest supercomputers worldwide, generating and analyzing data in the multi-petabyte range. In the Lattice QCD community, a custom metadata standard (QCDml) that includes certain provenance information already exists for one part of the workflow, the so-called generation of configurations. In this paper, we follow the W3C PROV standard and formulate a provenance model that covers both the generation part and the so-called measurement part of the Lattice QCD workflow. We demonstrate the applicability of this model and show how it can be used to answer some provenance-related research questions. However, many important provenance questions in the Lattice QCD community require extensions of this model. To this end, we propose a multi-layered provenance approach that combines prospective and retrospective elements.