Reinforcement Learning (RL) has recently been widely applied to network traffic management and control because some of its variants do not require prior knowledge of network models. In this paper, we present a novel scheduler for real-time multimedia delivery in multipath systems, based on an Actor-Critic (AC) RL algorithm. We focus on a challenging scenario: real-time video streaming from an Unmanned Aerial Vehicle (UAV) over multiple wireless paths. Acting as an RL agent, the scheduler learns in real time the optimal policy for path selection, path rate allocation, and redundancy estimation for flow protection. The scheduler is implemented as a module of the GStreamer framework and can be used in real or simulated settings. Simulation results show that our scheduler can target a very low loss rate at the receiver by dynamically adapting the scheduling policy to path conditions in real time, without performing training or relying on prior knowledge of network channel models.
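The idea of an agent that learns a path-selection policy online, without a channel model, can be illustrated with a minimal sketch. This is not the authors' scheduler: it is a single-state simplification of actor-critic (equivalent to a gradient bandit with a learned baseline), it covers path selection only (no rate allocation or redundancy estimation), and it does not use GStreamer. The function names, the reward definition (1 for a delivered packet, 0 for a lost one), the learning rates, and the simulated loss rates are all assumptions made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn path preferences into selection probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def learn_online(loss_rates, steps=20000, actor_lr=0.1, critic_lr=0.05, seed=0):
    """Learn a path-selection policy from per-packet delivery feedback.

    loss_rates holds hypothetical per-path loss probabilities. They are used
    only to simulate the channel; the agent never reads them directly.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(loss_rates)  # actor parameters, one per path
    value = 0.0                      # critic: running estimate of the reward

    for _ in range(steps):
        probs = softmax(prefs)
        path = rng.choices(range(len(prefs)), weights=probs)[0]

        # Simulated channel (assumption): reward 1 if delivered, 0 if lost.
        reward = 0.0 if rng.random() < loss_rates[path] else 1.0

        # Critic: error between the observed reward and the current estimate.
        td_error = reward - value
        value += critic_lr * td_error

        # Actor: policy-gradient step for a softmax policy, using
        # d log pi(path) / d prefs[i] = 1[i == path] - probs[i].
        for i in range(len(prefs)):
            indicator = 1.0 if i == path else 0.0
            prefs[i] += actor_lr * td_error * (indicator - probs[i])

    return softmax(prefs)


if __name__ == "__main__":
    # Three simulated paths; the second has the lowest loss rate.
    print(learn_online([0.40, 0.02, 0.20]))
```

In this sketch the critic's error signal scales the actor's update, so paths that deliver more often than the current estimate predicts gain selection probability. A fuller scheduler would condition the policy on observed path state and extend the action space to per-path rates and redundancy levels.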