Designing effective routing strategies for mobile wireless networks is challenging due to the need to seamlessly adapt routing behavior to spatially diverse and temporally changing network conditions. In this work, we use deep reinforcement learning (DeepRL) to learn a scalable and generalizable single-copy routing strategy for such networks. We make the following contributions: i) we design a reward function that enables the DeepRL agent to explicitly trade off competing network goals, such as minimizing delay vs. the number of transmissions per packet; ii) we propose a novel set of relational neighborhood, path, and context features to characterize mobile wireless networks and model device mobility independently of a specific network topology; and iii) we use a flexible training approach that allows us to combine data from all packets and devices into a single offline centralized training set to train a single DeepRL agent. To evaluate generalizability and scalability, we train our DeepRL agent on one mobile network scenario and then test it on other mobile scenarios, varying the number of devices and transmission ranges. Our results show that our learned single-copy routing strategy outperforms all strategies other than the optimal one in terms of delay, even on scenarios on which the DeepRL agent was not trained.
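As an illustration of the delay-vs.-transmissions trade-off in contribution i), one possible form of such a reward (a sketch under assumed notation, not the exact formulation used in this work) is a per-packet weighted combination \( r = -\bigl(\beta\, d + (1-\beta)\, m\bigr) \), where \( d \) denotes the packet's delivery delay, \( m \) the number of transmissions it incurred, and \( \beta \in [0,1] \) a tunable weight balancing the two objectives; the symbols \( d \), \( m \), and \( \beta \) are illustrative assumptions rather than notation taken from the paper.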