We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. Modern communication systems are becoming increasingly complex and must handle multiple types of traffic with widely varying characteristics, such as arrival rates and service times. This, coupled with the need for rapid network deployment, renders infeasible a bottom-up approach of first characterizing the traffic and then devising an appropriate scheduling protocol. In contrast, we formulate a top-down approach to scheduling: given an unknown network and a set of scheduling policies, we use a policy-gradient-based reinforcement learning algorithm to produce a scheduler that performs better than the available atomic policies. We derive convergence results and analyze the finite-time performance of the algorithm. Simulation results show that the algorithm performs well even when the arrival rates are nonstationary, and that it can stabilize the system even when the constituent policies are unstable.
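To make the top-down idea concrete, the following is a minimal toy sketch, not the algorithm analyzed in the paper. It assumes a single server shared by two queues with Bernoulli arrivals, two hypothetical atomic policies (`serve_queue_0` and `serve_queue_1`), a softmax mixture over those policies, and a REINFORCE-style update that uses the negative total backlog as the per-slot reward. All names, parameters, and the network model are illustrative assumptions.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical atomic policies: each maps the queue state to the queue to serve.
def serve_queue_0(queues):
    return 0


def serve_queue_1(queues):
    return 1


def run_mixture(policies, arrival_probs, steps=5000, step_size=0.01, seed=0):
    """Learn mixing weights over atomic policies with a REINFORCE-style update.

    Assumed toy model: one server, at most one departure per slot, and
    Bernoulli arrivals with the given per-queue probabilities.
    Returns (final mixing probabilities, time-averaged total backlog).
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(policies)      # softmax preferences, one per policy
    queues = [0] * len(arrival_probs)  # current queue lengths
    baseline = 0.0                     # running average cost (variance reduction)
    total_backlog = 0

    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Sample which atomic policy controls this slot.
        k = rng.choices(range(len(policies)), weights=probs)[0]
        served = policies[k](queues)
        if queues[served] > 0:
            queues[served] -= 1
        # Bernoulli arrivals to every queue.
        for i, p in enumerate(arrival_probs):
            if rng.random() < p:
                queues[i] += 1

        cost = sum(queues)
        total_backlog += cost
        advantage = -(cost - baseline)  # reward is the negative backlog

        # Gradient of log softmax: indicator(j == k) - probs[j].
        for j in range(len(prefs)):
            grad_log = (1.0 if j == k else 0.0) - probs[j]
            prefs[j] += step_size * advantage * grad_log

        baseline += (cost - baseline) / t

    return softmax(prefs), total_backlog / steps


if __name__ == "__main__":
    probs, avg_backlog = run_mixture([serve_queue_0, serve_queue_1], [0.3, 0.3])
    print("mixing probabilities:", probs)
    print("average backlog:", avg_backlog)
```

In this assumed setup each atomic policy alone leaves one queue unserved, so neither is stable by itself, whereas a suitable mixture can serve both queues. The one-step reward used here is a crude learning signal, so the sketch only illustrates the structure of learning over a set of given policies and makes no claim about convergence or delay performance.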