As serverless computing moves more applications to the cloud, it is increasingly necessary to support the native life-cycle execution of those applications in the data center. Existing cloud orchestration systems, however, either focus on short-running workflows (IBM Composer, Amazon Step Functions Express Workflows) or impose considerable overheads when synchronizing massively parallel jobs (Azure Durable Functions, Amazon Step Functions). None of them is an open system that enables extensible interception and optimization of custom workflows. We present Triggerflow, an extensible trigger-based orchestration architecture for serverless workflows. We demonstrate that Triggerflow is a novel serverless building block capable of constructing different reactive orchestrators (State Machines, Directed Acyclic Graphs, Workflow as Code, and a Federated Learning orchestrator). We also validate that it can support high-volume event processing workloads, auto-scale on demand and scale down to zero when not in use, and transparently guarantee fault tolerance and efficient resource usage when orchestrating long-running scientific workflows.