The Join-the-Shortest-Queue (JSQ) load balancing scheme is known to minimise the average response time of jobs in homogeneous systems with identical servers. However, for {\em heterogeneous} systems with servers having different processing speeds, finding an optimal load balancing scheme remains an open problem for finite system sizes. Recently, for systems with heterogeneous servers, a variant of JSQ called the {\em Speed-Aware-Join-the-Shortest-Queue (SA-JSQ)} scheme has been shown to achieve asymptotic optimality as the number of servers $n$ tends to infinity and the arrival rate in the system normalised by the number of servers remains constant. Motivated by this result, in this paper, we investigate the performance of the SA-JSQ scheme for heterogeneous systems in the {\em Halfin-Whitt} traffic regime. Our analysis begins by establishing that appropriately scaled and centered version of the Markov process describing system dynamics weakly converges to a two-dimensional reflected {\em Ornstein-Uhlenbeck (OU) process}. We then show using {\em Stein's method} that the stationary distribution of the underlying Markov process converges to that of the OU process as the system size increases by establishing the validity of interchange of limits. Finally, through coupling with a suitably constructed system, we show that SA-JSQ asymptotically minimises the diffusion-scaled total number of jobs and the diffusion-scaled number of waiting jobs in the steady-state in the Halfin-Whitt regime among all policies which dispatch jobs based on queue lengths and server speeds.
翻译:暂无翻译