Using tiny, equal-sized tasks (Homogeneous microTasking, HomT) has long been regarded as an effective way of balancing load in parallel computing systems. When combined with nodes pulling in work upon becoming idle, HomT has the desirable property of automatically adapting its load distribution to the processing capacities of participating nodes: more powerful nodes finish their work sooner and therefore pull in additional work faster. As a result, HomT is deemed especially desirable in settings with heterogeneous (and possibly dynamically changing) processing capacities. However, HomT incurs additional scheduling and I/O overheads that can make this load balancing scheme costly in some scenarios. In this paper, we first analyze these advantages and disadvantages of HomT. We then propose an alternative load balancing scheme, Heterogeneous MacroTasking (HeMT), wherein the workload is intentionally partitioned according to the nodes' processing capacities. Our goal is to study when HeMT is able to overcome the performance disadvantages of HomT. We implement a prototype of HeMT within the Apache Spark application framework, with complementary enhancements to the Apache Mesos cluster manager. Spark's built-in scheduler, when parameterized appropriately, implements HomT. Our experimental results show that HeMT outperforms HomT when accurate workload-specific estimates of nodes' processing capacities are learned. As a representative result, Spark with HeMT offers about 10% better average completion times than the default system for realistic data processing workloads.
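The trade-off described above can be illustrated with a toy model. The following is a minimal sketch, not the paper's implementation: it assumes a fixed per-task overhead (standing in for scheduling and I/O costs), node speeds expressed in work units per time unit, and hypothetical function names `homt_makespan` and `hemt_makespan`. It does not use Spark or Mesos, and its numbers are illustrative only.

```python
import heapq


def homt_makespan(total_work, speeds, num_tasks, overhead):
    """Completion time when idle nodes pull equal-sized microtasks.

    Assumption: every task pays the same fixed `overhead`, regardless of size.
    """
    task_size = total_work / num_tasks
    # Min-heap of (time at which the node becomes idle, node index).
    idle = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle)
    finish = 0.0
    for _ in range(num_tasks):
        # The node that becomes idle first pulls the next task.
        t, i = heapq.heappop(idle)
        done = t + overhead + task_size / speeds[i]
        finish = max(finish, done)
        heapq.heappush(idle, (done, i))
    return finish


def hemt_makespan(total_work, speeds, estimates, overhead):
    """Completion time with one macrotask per node.

    Each node's share is proportional to its *estimated* capacity, while the
    time it takes depends on its *actual* speed.
    """
    total_est = sum(estimates)
    return max(
        overhead + (total_work * est / total_est) / speed
        for speed, est in zip(speeds, estimates)
    )


if __name__ == "__main__":
    speeds = [1.0, 2.0, 4.0]  # hypothetical heterogeneous nodes
    work, overhead = 70.0, 0.5
    print("HomT, 70 microtasks:", homt_makespan(work, speeds, 70, overhead))
    print("HeMT, accurate estimates:", hemt_makespan(work, speeds, speeds, overhead))
    print("HeMT, uniform estimates:", hemt_makespan(work, speeds, [1.0] * 3, overhead))
```

In this model, HomT pays the overhead once per microtask, whereas HeMT pays it once per node but depends on how closely the estimates match the actual speeds. This mirrors the condition stated above that HeMT outperforms HomT when capacity estimates are accurate.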