Serverless computing is increasingly being used for parallel computations, which have traditionally been implemented as stateful applications. Executing complex, burst-parallel, directed acyclic graph (DAG) jobs poses a major challenge for serverless execution frameworks, which must rapidly scale and schedule tasks at high throughput while minimizing data movement across tasks. We demonstrate that, for serverless parallel computations, decentralized scheduling distributes scheduling decisions across Lambda executors that can schedule tasks in parallel, and brings multiple benefits, including enhanced data locality, reduced network I/O, automatic resource elasticity, and improved cost effectiveness. We describe the implementation and deployment of our new serverless parallel framework, called Wukong, on AWS Lambda. We show that Wukong achieves near-ideal scalability, executes parallel computation jobs up to 68.17x faster, reduces network I/O by multiple orders of magnitude, and achieves 92.96% tenant-side cost savings compared to numpywren.