A reinforcement-learning-based non-uniform compressed sensing (NCS) framework for time-varying signals is introduced. The proposed scheme, referred to as RL-NCS, aims to boost signal-recovery performance through an optimal and adaptive distribution of sensing energy between two groups of signal coefficients: region of interest (ROI) coefficients and non-ROI coefficients. The ROI coefficients are typically more important and must be reconstructed more accurately than the non-ROI coefficients. To this end, the ROI is predicted at each time step using one of two approaches. The first employs a long short-term memory (LSTM) network for the prediction; the second carries forward the previous ROI information as the prediction for the next step. Using an exploration-exploitation strategy, a Q-network learns to choose the better approach for designing the measurement matrix at each step. Furthermore, a joint loss function is introduced for efficient training of both the Q-network and the LSTM network. Results indicate a significant performance gain for the proposed method, even for rapidly varying signals and a reduced number of measurements.
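To illustrate the two mechanisms the abstract describes, below is a minimal sketch of (i) a non-uniform measurement matrix whose sensing energy is concentrated on a predicted ROI support and (ii) an epsilon-greedy choice between the two ROI predictors. The function names, the roi_energy parameter, and the column-rescaling rule are hypothetical stand-ins, since the abstract does not specify the paper's actual energy-allocation scheme or Q-network architecture.

```python
import numpy as np

def design_measurement_matrix(n, m, roi_idx, roi_energy=0.8, rng=None):
    """Sketch of a non-uniform Gaussian measurement matrix (hypothetical rule).

    Columns of an i.i.d. Gaussian matrix are rescaled so that a fraction
    `roi_energy` of the total sensing energy falls on the predicted ROI
    indices `roi_idx`, and the remainder is spread over non-ROI columns.
    """
    rng = rng or np.random.default_rng()
    phi = rng.standard_normal((m, n)) / np.sqrt(m)  # unit-energy columns in expectation
    scale = np.full(n, np.sqrt((1.0 - roi_energy) / (n - len(roi_idx))))
    scale[roi_idx] = np.sqrt(roi_energy / len(roi_idx))
    return phi * scale  # broadcasts the per-column scaling

def choose_predictor(q_values, eps=0.1, rng=None):
    """Epsilon-greedy selection between the two ROI predictors:
    action 0 -> reuse the previous ROI, action 1 -> LSTM prediction."""
    rng = rng or np.random.default_rng()
    if rng.random() < eps:
        return int(rng.integers(2))   # explore: random action
    return int(np.argmax(q_values))   # exploit: highest Q-value
```

With this rescaling, the expected squared column norms sum to roi_energy over the ROI and 1 - roi_energy elsewhere, which is one simple way to realize the adaptive energy split the abstract refers to.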