Advances in high-throughput simulation (HTS) software have made computational databases and big data common resources in materials science. However, while computational power continues to grow, software packages that orchestrate complex workflows in heterogeneous environments remain scarce. This paper introduces mkite, a Python package for performing HTS in distributed computing environments. The mkite toolkit follows the server-client pattern, decoupling production databases from client runners. When used in combination with message brokers, mkite allows any available client to perform calculations without prior hardware specification on the server side. Furthermore, the software supports the creation of complex workflows with multiple inputs and branches, facilitating the exploration of combinatorial chemical spaces. Software design principles are discussed in detail, highlighting how decoupling simulation from data management tasks helps diversify simulation environments. To exemplify how mkite handles simulation workflows of combinatorial systems, case studies on zeolite synthesis and surface catalyst discovery are provided. Finally, key differences from other atomistic simulation workflows are outlined. The mkite suite can enable HTS in distributed computing environments, simplifying workflows with heterogeneous hardware and software and helping deploy calculations at scale.