We review the role of information and learning in the stability and optimization of queueing systems. In recent years, techniques from supervised learning, bandit learning, and reinforcement learning have been applied to queueing systems, supported by the increasing role of information in decision making. We present observations and new results that help rationalize the application of these areas to queueing systems. We prove that the MaxWeight and BackPressure policies are an application of Blackwell's Approachability Theorem, which connects queueing-theoretic results with adversarial learning. We then discuss the requirements of statistical learning for service parameter estimation. As an example, we show how queue size regret can be bounded when applying a perceptron algorithm to classify service. Next, we discuss the role of state information in improved decision making. Here we contrast the roles of epistemic information (information on uncertain parameters) and aleatoric information (information on an uncertain state). Finally, we review recent advances in the theory of reinforcement learning and queueing, and discuss current research challenges.