Reinforcement Learning for Efficient and Tuning-Free Link Adaptation

Link adaptation is the process by which a wireless link adapts its data transmission parameters to the dynamic channel state. Classical link adaptation relies on tuning parameters that are difficult to configure for optimal link performance. Recently, reinforcement learning has been proposed to automate link adaptation, with the transmission parameters modeled as the discrete arms of a multi-armed bandit. In this context, we propose a latent learning model for link adaptation that exploits the correlation between data transmission parameters. Further, motivated by the recent success of Thompson sampling for multi-armed bandit problems, we propose a latent Thompson sampling (LTS) algorithm that quickly learns the optimal parameters for a given channel state. We extend LTS to fading wireless channels through a tuning-free mechanism that automatically tracks the channel dynamics. In numerical evaluations with fading wireless channels, LTS improves the link throughput by up to 100% compared to state-of-the-art link adaptation algorithms.
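To make the bandit formulation concrete, the following is a minimal sketch of plain Beta-Bernoulli Thompson sampling over a set of discrete transmission rates. It is not the LTS algorithm: it treats the arms as independent, does not exploit correlation between transmission parameters, and assumes a static channel. The rate values, the ACK probabilities, and the function name `thompson_link_adaptation` are hypothetical and chosen only for illustration.

```python
import random


def thompson_link_adaptation(rates, success_probs, n_rounds=5000, seed=0):
    """Plain Thompson sampling over discrete transmission rates (not LTS).

    rates:         assumed spectral efficiency of each arm (bits/s/Hz)
    success_probs: assumed true ACK probability of each arm; hidden from the
                   learner and used only to simulate the channel
    Returns the number of times each arm was selected and the final
    Beta posterior parameters (alpha, beta) per arm.
    """
    rng = random.Random(seed)
    n_arms = len(rates)
    alpha = [1.0] * n_arms  # 1 + number of ACKs observed (uniform prior)
    beta = [1.0] * n_arms   # 1 + number of NACKs observed
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible success probability per arm from its posterior.
        sampled = [rng.betavariate(alpha[k], beta[k]) for k in range(n_arms)]
        # Select the arm with the highest sampled expected throughput.
        arm = max(range(n_arms), key=lambda k: rates[k] * sampled[k])
        # Simulate the transmission outcome on a static channel.
        ack = rng.random() < success_probs[arm]
        if ack:
            alpha[arm] += 1.0
        else:
            beta[arm] += 1.0
        pulls[arm] += 1

    return pulls, list(zip(alpha, beta))


# Hypothetical arms: higher rates succeed less often on this assumed channel.
RATES = [0.5, 1.0, 2.0, 4.0]
SUCCESS_PROBS = [0.99, 0.95, 0.80, 0.20]

pulls, posterior = thompson_link_adaptation(RATES, SUCCESS_PROBS)
best_arm = max(range(len(RATES)), key=pulls.__getitem__)
print("selections per arm:", pulls)
print("most selected arm:", best_arm)
```

With these assumed values, the arm at index 2 has the highest expected throughput (2.0 × 0.80 = 1.6 bits/s/Hz), so the sampler should concentrate its selections there. Because each arm is learned independently, the number of exploratory transmissions grows with the number of arms; this is the cost that a model exploiting the correlation between transmission parameters aims to reduce.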
