Scientific workflow management systems (SWMSs) and resource managers together ensure that tasks are scheduled on provisioned resources so that all dependencies are obeyed and some optimization goal, such as makespan minimization, is achieved. In practice, however, scheduling responsibilities are not clearly divided between an SWMS and a resource manager, because there is no agreed-upon separation of concerns between their components. This has two consequences. First, the lack of a standardized API for exchanging scheduling information between SWMSs and resource managers hinders portability: replacing one component with another (e.g., one SWMS with another SWMS on the same resource manager) requires costly adaptations. Second, because functionality overlaps, current installations often effectively run two schedulers, each making partial scheduling decisions under incomplete information, which leads to suboptimal workflow scheduling. In this paper, we propose a simple REST interface between SWMSs and resource managers that allows any SWMS to pass dynamic workflow information to a resource manager, enabling maximally informed scheduling decisions. As an example, we provide an implementation of this API using Nextflow as the SWMS and Kubernetes as the resource manager. Our experiments with nine real-world workflows show that this strategy reduces makespan by up to 25.1%, and by 10.8% on average, compared to the standard Nextflow/Kubernetes configuration. Furthermore, a more widespread implementation of this API would enable leaner code bases, a simpler exchange of workflow system components, and a unified place to implement new scheduling algorithms.
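To make the idea of passing dynamic workflow information to the resource manager more concrete, the following is a minimal in-memory sketch, not the interface proposed in the paper. The class `WorkflowAwareScheduler`, its methods, the process names, and the ranking rule (longest chain of downstream processes first) are all hypothetical and chosen only for illustration; in a real deployment, the two registration methods would presumably be backed by REST calls from the SWMS.

```python
from collections import defaultdict


class WorkflowAwareScheduler:
    """Hypothetical resource-manager-side store for workflow information."""

    def __init__(self):
        # DAG edges announced by the SWMS: process -> downstream processes.
        self.successors = defaultdict(set)
        # (task_id, process) pairs that are ready but not yet placed.
        self.ready = []

    def register_edge(self, upstream, downstream):
        # Stands in for the SWMS announcing part of the workflow DAG.
        self.successors[upstream].add(downstream)

    def submit_task(self, task_id, process):
        # Stands in for the SWMS submitting a task whose inputs are available.
        self.ready.append((task_id, process))

    def rank(self, process):
        # Length of the longest chain of processes downstream of `process`.
        children = self.successors.get(process, ())
        if not children:
            return 0
        return 1 + max(self.rank(child) for child in children)

    def next_batch(self, free_slots):
        # Illustrative policy: place tasks with the longest downstream chain first.
        self.ready.sort(key=lambda task: self.rank(task[1]), reverse=True)
        batch = self.ready[:free_slots]
        self.ready = self.ready[free_slots:]
        return [task_id for task_id, _ in batch]


scheduler = WorkflowAwareScheduler()
scheduler.register_edge("align", "call_variants")
scheduler.register_edge("call_variants", "annotate")
scheduler.register_edge("qc", "report")
scheduler.submit_task("t1", "qc")
scheduler.submit_task("t2", "align")
scheduler.submit_task("t3", "report")
first_batch = scheduler.next_batch(free_slots=2)
print(first_batch)
```

With only two free slots, the sketch places the `align` task before the `qc` task and defers `report`, a decision that a resource manager without knowledge of the workflow DAG could not make on the same grounds.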