We consider load balancing in large-scale heterogeneous server systems in the presence of data locality that imposes constraints on which tasks can be assigned to which servers. The constraints are naturally captured by a bipartite graph between the servers and the dispatchers handling assignments of various arrival flows. When a task arrives, the corresponding dispatcher assigns it to a server with the shortest queue among $d\geq 2$ randomly selected servers obeying the above constraints. Server processing speeds are heterogeneous and depend on the server type. For a broad class of bipartite graphs, we characterize the limit of the appropriately scaled occupancy process, both at the process level and in steady state, as the system size becomes large. Using this characterization, we show that data locality constraints can be used to significantly improve the performance of heterogeneous systems. This is in stark contrast to either heterogeneous servers in a fully flexible system or data locality constraints in systems with homogeneous servers, both of which have been observed to degrade system performance. Extensive numerical experiments corroborate the theoretical results.
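The assignment rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dispatch`, the list-based representation of the bipartite graph, sampling without replacement, and the tie-breaking behavior are all assumptions made for illustration. Server speeds do not appear here because, as the abstract states the rule, the choice depends only on queue lengths among the sampled compatible servers.

```python
import random


def dispatch(queue_lengths, compatible_servers, d=2, rng=random):
    """Pick a server for one arriving task (hypothetical helper).

    queue_lengths: current queue length of every server, indexed by server.
    compatible_servers: list of server indices adjacent to this dispatcher
        in the bipartite compatibility graph.
    d: number of compatible servers to sample.
    """
    # Assumption: sample without replacement, capped at the neighborhood size.
    k = min(d, len(compatible_servers))
    sampled = rng.sample(compatible_servers, k)
    # Assumption: among tied servers, the one sampled first is chosen.
    return min(sampled, key=lambda s: queue_lengths[s])


# Usage: one dispatcher that may only use servers 0, 2 and 3.
queues = [3, 0, 5, 1]
neighbors = [0, 2, 3]
chosen = dispatch(queues, neighbors, d=2, rng=random.Random(0))
queues[chosen] += 1  # the task joins the chosen server's queue
```

Server 1 has the shortest queue overall but is never selected, because it is not adjacent to this dispatcher in the compatibility graph.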