Scheduling Servers with Stochastic Bilinear Rewards

In this paper, we study scheduling in multi-class, multi-server queueing systems in which the stochastic rewards of job-server assignments follow a bilinear model in the feature vectors characterizing jobs and servers. A bilinear model captures pairwise interactions between job features and server features. Our goal is regret minimization: we aim to maximize the cumulative reward of job-server assignments over a time horizon, measured against an oracle policy that has complete information about the system parameters, while keeping the queueing system stable and allowing for different job priorities. The scheduling problem we study is motivated by various applications, including matching in online platforms, such as crowdsourcing and labour platforms, and cluster computing systems. We study a scheduling algorithm based on weighted proportionally fair allocation criteria augmented with marginal costs for reward maximization, along with a linear bandit algorithm for estimating the rewards of job-server assignments. For a baseline setting, in which jobs have identical mean service times, we show that our algorithm has sub-linear regret, as well as a sub-linear bound on the mean queue length, in the time horizon. We show that similar bounds hold under more general assumptions, allowing mean service times to differ across job classes and the set of server classes to vary over time. We also show stability conditions for distributed iterative algorithms for computing allocations, which is of interest in large-scale system applications. We demonstrate the efficiency of our algorithms by numerical experiments using both synthetic, randomly generated data and a real-world cluster computing data trace.
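To make the bilinear reward model concrete, here is a minimal sketch in Python. It assumes the mean reward of assigning a job with feature vector x to a server with feature vector y is x^T Θ y for an unknown parameter matrix Θ. The names (`theta`, `bilinear_mean_reward`, `sample_reward`), the numeric values, and the Gaussian noise are illustrative assumptions, not taken from the paper. The sketch also checks the standard identity x^T Θ y = ⟨vec(Θ), vec(x y^T)⟩, which shows that the reward is linear in the flattened outer product of the features and suggests why a linear bandit algorithm can be used to estimate it.

```python
import random

def bilinear_mean_reward(x, theta, y):
    """Mean reward x^T Theta y for job features x and server features y."""
    return sum(x[i] * theta[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def outer_flat(x, y):
    """Flattened outer product vec(x y^T), in row-major order."""
    return [xi * yj for xi in x for yj in y]

def flatten(theta):
    """Flatten the matrix Theta in row-major order to match outer_flat."""
    return [v for row in theta for v in row]

def sample_reward(x, theta, y, noise_std=0.1, rng=random):
    """Noisy reward observation; Gaussian noise is an assumption of this sketch."""
    return bilinear_mean_reward(x, theta, y) + rng.gauss(0.0, noise_std)

# Hypothetical parameter matrix and feature vectors, for illustration only.
theta = [[0.5, -0.2],
         [0.1, 0.3]]
x = [1.0, 2.0]   # job features
y = [0.5, 1.0]   # server features

mean = bilinear_mean_reward(x, theta, y)

# The same quantity written as a linear function of vec(x y^T).
linear_form = sum(a * b for a, b in zip(flatten(theta), outer_flat(x, y)))
```

Here `mean` and `linear_form` agree up to floating-point rounding, so a learner can treat `outer_flat(x, y)` as the feature vector of a linear model with unknown parameter `flatten(theta)`.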
