Designing effective routing strategies for mobile wireless networks is challenging due to the need to seamlessly adapt routing behavior to spatially diverse and temporally changing network conditions. In this work, we use deep reinforcement learning (DeepRL) to learn a scalable and generalizable single-copy routing strategy for such networks. We make the following contributions: i) we design a reward function that enables the DeepRL agent to explicitly trade off competing network goals, such as minimizing delay vs. the number of transmissions per packet; ii) we propose a novel set of relational neighborhood, path, and context features to characterize mobile wireless networks and model device mobility independently of a specific network topology; and iii) we use a flexible training approach that allows us to combine data from all packets and devices into a single offline centralized training set to train a single DeepRL agent. To evaluate generalizability and scalability, we train our DeepRL agent on one mobile network scenario and then test it on other mobile scenarios, varying the number of devices and transmission ranges. Our results show that our learned single-copy routing strategy outperforms all strategies other than the optimal one in terms of delay, even on scenarios on which the DeepRL agent was not trained.
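The delay-vs-transmissions trade-off described above can be sketched as a weighted per-hop reward. This is a minimal, hypothetical illustration, not the paper's actual reward function: the weight `alpha`, the additive per-hop form, and the function names are all assumptions.

```python
# Hypothetical sketch of a reward that trades off packet delay against
# the number of transmissions per packet. The weight alpha and the
# additive per-hop form are assumptions, not taken from the paper.

def per_hop_reward(hop_delay: float, alpha: float = 0.5) -> float:
    """Negative cost: each forwarding hop pays its delay plus a fixed
    per-transmission penalty weighted by alpha."""
    transmission_cost = 1.0  # one transmission per forwarding hop
    return -(hop_delay + alpha * transmission_cost)

# A packet's return is the sum of per-hop rewards along its route.
# Larger alpha steers the agent toward fewer transmissions; smaller
# alpha prioritizes lower end-to-end delay.
route_delays = [0.2, 0.5, 0.1]  # illustrative per-hop delays
episode_return = sum(per_hop_reward(d) for d in route_delays)
```

Because each hop contributes one transmission, tuning `alpha` directly shifts the agent's preference between short-delay routes and routes with fewer forwarding hops.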