This paper focuses on improving resource allocation in a long-range wide-area network (LoRaWAN) in terms of the packet delivery ratio (PDR), i.e., the fraction of packets sent by end devices (EDs) that are successfully received. The choice of transmission parameters significantly affects the PDR. Employing reinforcement learning (RL), we propose a resource allocation algorithm that enables the EDs to configure their transmission parameters in a distributed manner. We model the resource allocation problem as a multi-armed bandit (MAB) and address it with a two-phase algorithm named MIX-MAB, which combines the exponential weights for exploration and exploitation (EXP3) and successive elimination (SE) algorithms. We evaluate the performance of MIX-MAB through simulations and compare it with existing approaches. Numerical results show that the proposed solution outperforms the existing schemes in terms of convergence time and PDR.
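To make the bandit formulation concrete, the following is a minimal sketch of the textbook EXP3 learner only, as it might run on a single ED. It is not the MIX-MAB algorithm: the abstract does not specify how the EXP3 and SE phases are combined, so the SE phase is omitted. The assumptions are that each arm stands for one candidate transmission-parameter configuration, that the reward is 1 when a packet is delivered and 0 otherwise, and that the per-arm success probabilities passed to the hypothetical `simulate` helper are invented purely for illustration.

```python
import math
import random


class Exp3:
    """Textbook EXP3 over a fixed set of arms with rewards in [0, 1]."""

    def __init__(self, n_arms, gamma=0.1, rng=None):
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescaling leaves the probabilities unchanged and avoids overflow.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(success_prob, rounds=5000, gamma=0.1, seed=0):
    """Hypothetical single-ED simulation; success_prob is illustrative only."""
    rng = random.Random(seed)
    learner = Exp3(len(success_prob), gamma, rng)
    delivered = 0
    for _ in range(rounds):
        arm, prob = learner.select()
        # Reward 1.0 if the packet sent with this configuration is delivered.
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        learner.update(arm, prob, reward)
        delivered += reward
    return learner, delivered / rounds


if __name__ == "__main__":
    # Three made-up configurations with different delivery probabilities.
    learner, pdr = simulate([0.2, 0.9, 0.4])
    print("empirical PDR:", pdr)
    print("final selection probabilities:", learner.probabilities())
```

In this sketch each ED learns only from its own delivery feedback, which matches the distributed setting described above; the `gamma` parameter trades off exploration against exploitation and would need tuning in practice.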