We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality-of-service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions which guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments which support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns.