Multiserver jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in today's computing clusters. Yet little is known about the delay performance of systems with multiserver jobs. We consider queueing models for multiserver jobs in a scaling regime where the total number of servers in the system becomes large, while both the system load and the number of servers that a job needs scale with the total number of servers. Prior work has derived upper bounds on the queueing probability in this scaling regime. However, without proper lower bounds, the existing results cannot be used to differentiate between policies. In this paper, we study the delay performance by establishing sharp bounds on the mean waiting time of multiserver jobs, where the waiting time of a job is the time it spends queueing rather than in service. We first consider the commonly used First-Come-First-Serve (FCFS) policy and characterize the exact order of its mean waiting time. We then prove a lower bound on the mean waiting time of all policies, and demonstrate that there is an order gap between this lower bound and the mean waiting time under FCFS. We finally complement the lower bound with an achievability result: we show that under a priority policy that we call P-Priority, the mean waiting time achieves the order of the lower bound. This achievability result implies the tightness of the lower bound, the asymptotic optimality of P-Priority, and the strict suboptimality of FCFS.