Advances in high-throughput simulation (HTS) software have made computational databases and big data common resources in materials science. However, while computational power continues to grow, software packages that orchestrate complex workflows in heterogeneous environments remain scarce. This paper introduces mkite, a Python package for performing HTS in distributed computing environments. The mkite toolkit is built on the server-client pattern, decoupling production databases from client runners. When used in combination with message brokers, mkite enables any available client to perform calculations without prior hardware specification on the server side. Furthermore, the software enables the creation of complex workflows with multiple inputs and branches, facilitating the exploration of combinatorial chemical spaces. Software design principles are discussed in detail, highlighting the usefulness of decoupling simulations and data management tasks to diversify simulation environments. To exemplify how mkite handles simulation workflows of combinatorial systems, case studies on zeolite synthesis and surface catalyst discovery are provided. Finally, key differences from other atomistic simulation workflows are outlined. The mkite suite can enable HTS in distributed computing environments, simplifying workflows with heterogeneous hardware and software and helping deploy calculations at scale.
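The decoupling described in the abstract can be illustrated with a minimal sketch. It is not mkite's API: the function names, the job dictionary layout, the "relax" recipe label, and the input labels are all hypothetical, and an in-process `queue.Queue` stands in for a real message broker. The point is only that the submitting side never names a target machine, and whichever client is free picks up the next job.

```python
import queue
import threading


def submit_jobs(job_queue, inputs, recipe):
    """Server side (hypothetical): publish one job per input.

    No hardware or host is specified here; the job only says what to run.
    """
    for name in inputs:
        job_queue.put({"recipe": recipe, "input": name})


def run_client(client_name, job_queue, result_queue):
    """Client side (hypothetical): pull jobs until none are left, then exit."""
    while True:
        try:
            job = job_queue.get_nowait()
        except queue.Empty:
            return
        # Placeholder for a real simulation: record the length of the label.
        output = {"n_chars": len(job["input"])}
        result_queue.put({"job": job, "output": output, "client": client_name})


def collect_results(result_queue):
    """Server side (hypothetical): drain finished jobs for later storage."""
    results = []
    while not result_queue.empty():
        results.append(result_queue.get())
    return results


def demo():
    # The two queues play the role of the broker; a production setup would
    # use an external service so that clients can live on other machines.
    job_queue = queue.Queue()
    result_queue = queue.Queue()

    submit_jobs(job_queue, ["MFI", "CHA", "FAU"], recipe="relax")

    clients = [
        threading.Thread(
            target=run_client,
            args=(f"client-{i}", job_queue, result_queue),
        )
        for i in range(2)
    ]
    for client in clients:
        client.start()
    for client in clients:
        client.join()

    return collect_results(result_queue)


if __name__ == "__main__":
    for result in demo():
        print(result["client"], result["job"]["input"], result["output"])
```

Which client handles which job is not fixed in advance and may differ between runs. That is the property the server-client design relies on: the database side only needs to know that a job was completed, not where.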