Models of parallel processing systems typically assume that one has $l$ workers and that jobs are split into an equal number of $k=l$ tasks. Splitting jobs into $k > l$ smaller tasks, i.e. using ``tiny tasks'', can yield performance and stability improvements because it reduces the variance in the amount of work assigned to each worker; as $k$ increases, however, the overhead involved in scheduling and managing the tasks begins to overtake the performance benefit. We perform extensive experiments on the effects of task granularity on an Apache Spark cluster and, based on these, develop a four-parameter model of task and job overhead that, in simulation, produces sojourn time distributions matching those of the real system. We also present analytical results illustrating how tiny tasks improve the stability region of split-merge systems, together with analytical bounds on the sojourn and waiting time distributions of both split-merge and single-queue fork-join systems with tiny tasks. Finally, we combine the overhead model with the analytical models to produce an analytical approximation to the sojourn and waiting time distributions of systems with tiny tasks that include overhead. Though no longer strict analytical bounds, these approximations match the Spark experimental results very well in both the split-merge and fork-join cases.
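The variance-reduction intuition behind tiny tasks can be illustrated with a minimal simulation sketch (all parameters here are hypothetical and this is not the paper's overhead model): a job of mean total work 1 is split into $k$ exponentially distributed tasks, which a greedy scheduler assigns to the least-loaded of $l$ workers. As $k$ grows past $l$, the makespan concentrates around the balanced value and its variance shrinks.

```python
import heapq
import random
import statistics

def makespan(task_times, n_workers):
    # Greedy list scheduling: each task goes to the least-loaded worker;
    # the makespan is the load of the busiest worker.
    loads = [0.0] * n_workers
    heapq.heapify(loads)
    for t in task_times:
        heapq.heappush(loads, heapq.heappop(loads) + t)
    return max(loads)

def simulate(l=4, k=4, trials=2000, seed=0):
    # Each job has mean total work 1, split into k Exp(rate k) tasks.
    rng = random.Random(seed)
    return [makespan([rng.expovariate(k) for _ in range(k)], l)
            for _ in range(trials)]

coarse = simulate(l=4, k=4)    # one task per worker (k = l)
tiny = simulate(l=4, k=64)     # tiny tasks (k >> l)
print("std, k=l :", statistics.pstdev(coarse))
print("std, k>>l:", statistics.pstdev(tiny))
```

With $k = l$ the makespan is the maximum of $l$ independent exponentials, while with $k \gg l$ the per-worker loads average out, so both the mean and the spread of the makespan drop; a real system would also pay a per-task overhead, which is exactly the trade-off the four-parameter model captures.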