Evaluating Distributed Execution of Workloads

Resource selection and task placement for distributed execution pose conceptual and implementation difficulties. Although resource selection and task placement are at the core of many tools and workflow systems, the methods used are ad hoc rather than based on models. Consequently, partial and non-interoperable implementations proliferate. We address both the conceptual and the implementation difficulties by experimentally characterizing diverse modalities of resource selection and task placement. We compare the architectures and capabilities of two systems: the AIMES middleware and the Swift workflow scripting language and runtime. We integrate these systems to enable the distributed execution of Swift workflows on Pilot-Jobs managed by the AIMES middleware. Our experiments characterize and compare alternative execution strategies by measuring the time to completion of heterogeneous, uncoupled workloads executed at diverse scales and on multiple resources. We measure the adverse effects of pilot fragmentation and of early binding of tasks to resources, and the benefits of backfill scheduling across pilots on multiple resources. We then use this insight to execute a multi-stage workflow across five production-grade resources. We discuss the importance of these results and their implications for other tools and workflow systems.
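The contrast between early binding and backfill scheduling across pilots can be illustrated with a toy model. The sketch below is not the AIMES or Swift implementation and uses no measurements from the paper: the pilot queue waits, core counts, task durations, and the function names `ttc_early_binding` and `ttc_late_binding` are all hypothetical. It assumes single-core, uncoupled tasks, round-robin partitioning for early binding, and a shared FIFO queue for late binding, where a task goes to whichever core on any pilot becomes available first.

```python
import heapq
from typing import List, Tuple

# A pilot is modeled as (queue_wait_seconds, cores). Both values are
# hypothetical illustration inputs, not data from the paper.
Pilot = Tuple[float, int]


def _run_on_cores(core_ready_times: List[float], durations: List[float]) -> float:
    """Assign each task, in order, to the core that becomes available first.

    Returns the time at which the last task finishes.
    """
    heap = list(core_ready_times)
    heapq.heapify(heap)
    finish = 0.0
    for duration in durations:
        start = heapq.heappop(heap)      # earliest available core
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(heap, end)        # core is busy until `end`
    return finish


def ttc_early_binding(pilots: List[Pilot], durations: List[float]) -> float:
    """Tasks are split round-robin across pilots before any pilot is active."""
    finish = 0.0
    for i, (wait, cores) in enumerate(pilots):
        share = durations[i::len(pilots)]
        if share:
            finish = max(finish, _run_on_cores([wait] * cores, share))
    return finish


def ttc_late_binding(pilots: List[Pilot], durations: List[float]) -> float:
    """Tasks wait in one shared queue; any core on any pilot may take the next one."""
    core_ready_times = [wait for wait, cores in pilots for _ in range(cores)]
    return _run_on_cores(core_ready_times, durations)


# Hypothetical scenario: one pilot clears the queue quickly, the other slowly.
EXAMPLE_PILOTS: List[Pilot] = [(10.0, 2), (100.0, 2)]
EXAMPLE_TASKS: List[float] = [5.0] * 8

print("early binding TTC:", ttc_early_binding(EXAMPLE_PILOTS, EXAMPLE_TASKS))
print("late binding TTC: ", ttc_late_binding(EXAMPLE_PILOTS, EXAMPLE_TASKS))
```

In this particular scenario, early binding leaves half of the tasks waiting for the slow pilot, while the shared queue lets the fast pilot absorb them. The model ignores data staging, pilot walltime limits, and scheduling overheads, so it only conveys the intuition and makes no claim about the magnitudes reported in the experiments.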

