We show in this work that reinforcement learning can be successfully applied to decoding short to moderate length sparse graph-based channel codes. Specifically, we focus on low-density parity-check (LDPC) codes, which have, for example, been standardized in the context of 5G cellular communication systems due to their excellent error-correcting performance. These codes are typically decoded via iterative belief propagation on the code's bipartite (Tanner) graph using a flooding schedule, i.e., all check and variable nodes in the Tanner graph are updated simultaneously. In contrast, in this paper we utilize a sequential update policy that selects an optimized check node (CN) schedule in order to improve decoding performance. In particular, we model the CN update process as a multi-armed bandit process with dependent arms and employ a Q-learning scheme to optimize the CN scheduling policy. To reduce the learning complexity, we propose a novel graph-induced CN clustering approach that partitions the state space such that dependencies between clusters are minimized. Our results show that, compared to other decoding approaches from the literature, the proposed reinforcement learning scheme not only significantly improves decoding performance but also dramatically reduces decoding complexity once the model is learned.
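To make the formulation above concrete, the following minimal sketch illustrates tabular Q-learning over sequential CN scheduling, where each CN is an arm and the state is the current syndrome pattern. It is only a rough illustration under several assumptions not taken from the paper: the toy matrix `H`, the syndrome-as-state encoding, the reward (reduction in the number of unsatisfied checks), and a Gallager-style bit flip standing in for the actual belief-propagation CN update; the graph-induced clustering step is omitted.

```python
import numpy as np

# Toy (4 x 6) parity-check matrix, chosen only for illustration;
# the paper targets standardized LDPC codes, e.g., from 5G.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]], dtype=int)
m, n = H.shape
rng = np.random.default_rng(0)

def syndrome(c):
    """Unsatisfied-check indicator vector for hard decisions c."""
    return H @ c % 2

def cn_update(c, j):
    """Stand-in for a BP check-node update (assumption: a Gallager-style
    bit flip replaces the paper's message passing). If check j is
    unsatisfied, flip its neighbour that sits in most unsatisfied checks."""
    s = syndrome(c)
    if s[j]:
        nbrs = np.flatnonzero(H[j])
        worst = nbrs[np.argmax(H[:, nbrs].T @ s)]
        c = c.copy()
        c[worst] ^= 1
    return c

Q = {}                                # tabular Q: (state, CN) -> value
alpha, gamma, eps = 0.3, 0.9, 0.2     # illustrative hyperparameters

for episode in range(2000):
    c = rng.integers(0, 2, n)         # random word standing in for a noisy channel output
    for t in range(3 * m):
        s = tuple(syndrome(c))        # state = syndrome pattern (assumption)
        if not any(s):
            break                     # zero syndrome: decoding succeeded
        # epsilon-greedy selection of the next CN to update (the "arm")
        if rng.random() < eps:
            j = int(rng.integers(m))
        else:
            j = max(range(m), key=lambda a: Q.get((s, a), 0.0))
        c_next = cn_update(c, j)
        s_next = tuple(syndrome(c_next))
        r = int(sum(s)) - int(sum(s_next))  # reward: drop in unsatisfied checks (assumption)
        best_next = max(Q.get((s_next, a), 0.0) for a in range(m))
        q = Q.get((s, j), 0.0)
        Q[(s, j)] = q + alpha * (r + gamma * best_next - q)
        c = c_next
```

In the bandit view, each CN is an arm whose payoff depends on the arms pulled before it, which is why a learned sequential schedule can outperform a fixed flooding order; once the Q-table (or, at scale, a function approximator) is trained, scheduling reduces to cheap greedy lookups.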