Load Balancing with Job-Size Testing: Performance Improvement or Degradation? (Load Balancing with Job-Size Testing: Performance Improvement or Degradation?)

In the context of decision making under explorable uncertainty, scheduling with testing is a powerful technique used in the management of computer systems to improve performance via better job-dispatching decisions. Upon job arrival, a scheduler may run some \emph{testing algorithm} against the job to extract some information about its structure, e.g., its size, and properly classify it. The acquisition of such knowledge comes with a cost because the testing algorithm delays the dispatching decisions, though this is under control. In this paper, we analyze the impact of such extra cost in a load balancing setting by investigating the following questions: does it really pay off to test jobs? If so, under which conditions? Under mild assumptions connecting the information extracted by the testing algorithm in relationship with its running time, we show that whether scheduling with testing brings a performance degradation or improvement strongly depends on the traffic conditions, system size and the coefficient of variation of job sizes. Thus, the general answer to the above questions is non-trivial and some care should be considered when deploying a testing policy. Our results are achieved by proposing a load balancing model for scheduling with testing that we analyze in two limiting regimes. When the number of servers grows to infinity in proportion to the network demand, we show that job-size testing actually degrades performance unless short jobs can be predicted reliably almost instantaneously \emph{and} the network load is sufficiently high. When the coefficient of variation of job sizes grows to infinity, we construct testing policies inducing an arbitrarily large performance gain with respect to running jobs untested.

翻译：基于任务大小测试的负载平衡：性能改善还是恶化？在可探索的不确定性决策下，调度与测试是一种强大的技术，在计算机系统管理中用于通过更好的任务调度决策来提高性能。在任务到达时，调度程序可以使用一些“测试算法”来提取有关其结构（例如大小）的一些信息并正确分类它。获取此类知识是有成本的，因为测试算法会延迟调度决策，尽管这是可控的。本文分析了在负载均衡环境中进行此额外成本的影响，通过研究以下问题来调度是否真正节省开支？如果是这样，在什么条件下？在连接测试算法中提取的信息与其运行时间存在关系的温和假设的前提下，我们表明，无论调度与测试是否带来性能下降或改善，它非常依赖于流量条件、系统大小和任务大小的变异系数。因此，对于上述问题的一般回答不是平凡的，当部署测试策略时应谨慎考虑。我们的结果是通过提出一个用于调度测试的负载平衡模型来实现的，我们在两个极限制度下对其进行了分析。当服务器数与网络负载成比例增加时，我们表明，任务大小测试实际上会导致性能下降，除非可以可靠并几乎瞬间地预测短任务，而且网络负载足够高。当任务大小的变异系数趋近于无穷大时，我们构造了测试策略，相对于未经测试的运行任务，可引入任意大的性能收益。