The innovative services empowered by the Internet of Things (IoT) require a seamless and reliable wireless infrastructure that enables communication within heterogeneous and dynamic low-power and lossy networks (LLNs). The Routing Protocol for LLNs (RPL) was designed to meet the communication requirements of a wide range of IoT application domains. However, RPL suffers from a load-balancing problem under heavy traffic load, which degrades network performance in terms of delay and packet delivery. In this paper, we tackle the load-balancing problem in RPL networks using a reinforcement-learning framework. The proposed method adopts Q-learning at each node to learn an optimal parent-selection policy based on dynamic network conditions. Each node maintains the routing information of its neighbours as Q-values that represent a composite routing cost as a function of the congestion level, link quality, and hop distance. The Q-values are updated continuously by exploiting RPL's existing signalling mechanism. The performance of the proposed approach is evaluated through extensive simulations and compared with existing work to demonstrate its effectiveness. The results show that the proposed method substantially improves network performance in terms of packet delivery and average delay, with only a marginal increase in signalling frequency.
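To make the described mechanism concrete, the following is a minimal, hypothetical sketch of per-node Q-learning for parent selection. The abstract does not specify exact formulas, so the weighted composite cost, the Q-update rule, the epsilon-greedy selection, and all parameter values here are illustrative assumptions, not the authors' actual algorithm.

```python
import random

class ParentSelector:
    """Hypothetical per-node Q-learning parent selection for RPL.

    Assumptions (not from the paper): a linearly weighted composite
    cost, a standard Q-learning update triggered on DIO reception,
    and epsilon-greedy exploration over candidate parents.
    """

    def __init__(self, alpha=0.3, gamma=0.8, epsilon=0.1,
                 w_congestion=0.5, w_etx=0.3, w_hops=0.2):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.w = (w_congestion, w_etx, w_hops)
        self.q = {}             # neighbour id -> learned routing cost

    def composite_cost(self, congestion, etx, hops):
        # Composite routing cost as a function of congestion level,
        # link quality (ETX), and hop distance (weights are assumptions).
        wc, we, wh = self.w
        return wc * congestion + we * etx + wh * hops

    def update(self, neighbour, congestion, etx, hops, neighbour_best_q):
        # Q-learning update on receiving routing signalling (e.g. a DIO):
        # immediate composite cost plus the neighbour's discounted
        # best downstream cost.
        cost = self.composite_cost(congestion, etx, hops)
        old = self.q.get(neighbour, cost)
        self.q[neighbour] = ((1 - self.alpha) * old
                             + self.alpha * (cost + self.gamma * neighbour_best_q))

    def select_parent(self):
        # Epsilon-greedy: usually the lowest-cost neighbour,
        # occasionally a random one to keep exploring.
        if not self.q:
            return None
        if random.random() < self.epsilon:
            return random.choice(list(self.q))
        return min(self.q, key=self.q.get)
```

Under this sketch, a congested neighbour accumulates a higher Q-value over successive updates and is passed over in favour of a less loaded parent, which is the load-balancing behaviour the paper targets.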