Just like the scientific data they generate, simulation workflows for research should be findable, accessible, interoperable, and reusable (FAIR). However, while significant progress has been made towards FAIR data, the majority of science and engineering workflows used in research remain poorly documented and often unavailable. They typically involve ad hoc scripts and manual steps, which hinders reproducibility and stifles progress. We introduce Sim2Ls (pronounced "simtools") and the Sim2L Python library, which allow developers to create and share end-to-end computational workflows with well-defined and verified inputs and outputs. The Sim2L library makes Sim2Ls, their requirements, and their services discoverable, verifies inputs and outputs, and automatically stores results in a globally accessible simulation cache and results database. This simulation ecosystem is available in nanoHUB, an open platform that also provides publication services for Sim2Ls, a computational environment for developers and users, and the hardware to execute runs and store results at no cost. We illustrate the use of Sim2Ls with two applications and discuss best practices for FAIR simulation workflows and associated data.