Developing complex biomolecular workflows is not always straightforward. It requires tedious development work to make the different biomolecular simulation and analysis tools interoperate. Moreover, the need to execute the pipelines on distributed systems adds further complexity. To address these issues, we propose a methodology to simplify the implementation of these workflows on HPC infrastructures. It combines a library, the BioExcel Building Blocks (BioBBs), which allows scientists to implement biomolecular pipelines as Python scripts, with the PyCOMPSs programming framework, which makes it easy to convert Python scripts into task-based parallel workflows that run on distributed computing systems such as HPC clusters, clouds, and containerized platforms. Using this methodology, we have implemented a set of computational molecular workflows and performed several experiments to validate its portability, scalability, reliability, and malleability.
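As a rough illustration of the idea, the sketch below shows how plain Python functions might be turned into tasks with the PyCOMPSs `@task` decorator and synchronized with `compss_wait_on`. The two step functions (`prepare_structure`, `run_simulation`) and the driver `run_pipeline` are hypothetical placeholders that only manipulate strings; they are not BioBB calls and are not taken from the paper. If PyCOMPSs is not installed, the sketch falls back to sequential stand-ins so that it still runs as an ordinary script.

```python
# Minimal sketch, assuming PyCOMPSs may or may not be available.
try:
    from pycompss.api.task import task
    from pycompss.api.api import compss_wait_on
except ImportError:
    # Sequential stand-ins (an assumption for illustration only):
    # the decorator leaves the function unchanged and the
    # synchronization call returns its argument as-is.
    def task(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

    def compss_wait_on(obj):
        return obj


@task(returns=1)
def prepare_structure(name):
    # Hypothetical placeholder for a structure-preparation step.
    return f"{name}:prepared"


@task(returns=1)
def run_simulation(prepared):
    # Hypothetical placeholder for a simulation step.
    return f"{prepared}:simulated"


def run_pipeline(names):
    # Each input forms an independent two-step chain, so a task-based
    # runtime is free to schedule the chains in parallel.
    results = [run_simulation(prepare_structure(n)) for n in names]
    # Wait for all task results before using them in the main program.
    return compss_wait_on(results)


if __name__ == "__main__":
    print(run_pipeline(["system_a", "system_b"]))
```

The script reads like sequential Python. The dependency between the two steps is expressed only through the value passed from one call to the next, which is the kind of information a task-based runtime can use to build the workflow graph.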