Reliable communication via multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) requires accurate channel estimation at the receiver. The existing literature largely focuses on denoising methods for channel estimation that depend on either (i) time-domain channel analysis with prior channel knowledge or (ii) supervised learning techniques, which require large pre-labeled datasets for training. To address these limitations, we present a frequency-domain denoising method based on a reinforcement learning framework that requires neither a priori channel knowledge nor pre-labeled data. Our methodology includes a new successive channel denoising process based on channel curvature computation, for which we obtain a channel curvature magnitude threshold to identify unreliable channel estimates. Based on this process, we formulate the denoising mechanism as a Markov decision process, in which the actions are defined through a geometry-based channel estimation update and the reward function is based on a policy that reduces the mean squared error (MSE). We then use Q-learning to update the channel estimates. Numerical results verify that our denoising algorithm successfully mitigates noise in channel estimates. In particular, our algorithm provides a significant improvement over the practical least squares (LS) estimation method and approaches the performance of ideal linear minimum mean square error (LMMSE) estimation with perfect knowledge of channel statistics.
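To make the idea of curvature-based detection concrete, the following is a minimal sketch and not the algorithm of this work. It assumes that "channel curvature" can be approximated by the magnitude of the discrete second difference of the frequency-domain estimates across subcarriers, that the threshold is supplied by hand (the abstract states that a threshold is obtained, but not its form), and that a flagged estimate is replaced by the midpoint of its two neighbors as a stand-in for the geometry-based update. The Markov decision process and the Q-learning step are omitted. All function names and numeric values are hypothetical.

```python
import numpy as np


def curvature_magnitude(h):
    """Magnitude of the discrete second difference along the subcarrier axis.

    Assumption: this is only a proxy for the curvature used in the paper.
    The two edge subcarriers have no two-sided neighbors and are set to 0.
    """
    h = np.asarray(h, dtype=complex)
    kappa = np.zeros(h.shape[0])
    kappa[1:-1] = np.abs(h[2:] - 2.0 * h[1:-1] + h[:-2])
    return kappa


def flag_unreliable(h, threshold):
    """Boolean mask of estimates whose curvature magnitude exceeds threshold.

    Assumption: the threshold is a hand-picked value, not the derived one.
    """
    return curvature_magnitude(h) > threshold


def smooth_flagged(h, mask):
    """Replace each flagged interior estimate by the midpoint of its neighbors.

    Updates are applied successively, so a later update sees earlier ones.
    Assumption: the midpoint rule stands in for the geometry-based update.
    """
    h = np.asarray(h, dtype=complex).copy()
    for k in np.flatnonzero(mask):
        if 0 < k < len(h) - 1:
            h[k] = 0.5 * (h[k - 1] + h[k + 1])
    return h


# Toy example: a linearly varying channel with one corrupted subcarrier.
h_true = 0.1 * np.arange(8) * (1 + 1j)
h_noisy = h_true.copy()
h_noisy[4] += 1.0  # hypothetical estimation error on subcarrier 4

mask = flag_unreliable(h_noisy, threshold=1.5)
h_denoised = smooth_flagged(h_noisy, mask)
print(np.flatnonzero(mask))  # prints [4]
```

In this toy case the corrupted subcarrier has the largest second difference, so only it is flagged, and the midpoint update restores the linear trend. A real channel is not linear in frequency, so the choice of threshold matters: too low a value flags genuine channel variation as noise, and too high a value lets noisy estimates pass.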