The integration of satellite and terrestrial networks is crucial for enhancing the performance of next-generation communication systems. However, such networks are hindered by long-distance path loss and by security risks in dense urban environments. In this work, we propose a satellite-terrestrial covert communication system assisted by an aerial active simultaneous transmitting and reflecting reconfigurable intelligent surface (AASTAR-RIS) to improve the channel capacity while ensuring transmission covertness. Specifically, we first derive the minimal detection error probability (DEP) under the worst-case condition in which the Warden has perfect channel state information (CSI). Then, we formulate an AASTAR-RIS-assisted satellite-terrestrial covert communication optimization problem (ASCCOP) that maximizes the sum of the fair channel capacity over all ground users while meeting a strict covertness constraint, by jointly optimizing the trajectory and active beamforming of the AASTAR-RIS. Because the state-action spaces are complex and high-dimensional and dynamic environments demand efficient exploration, we propose a generative deterministic policy gradient (GDPG) algorithm, a generative deep reinforcement learning (DRL) method, to solve the ASCCOP. Concretely, a generative diffusion model (GDM) serves as the policy representation of the algorithm and enhances exploration by generating diverse, high-quality samples through a series of denoising steps. Moreover, we incorporate an action gradient mechanism for policy improvement, which refines the better-performing state-action pairs through gradient ascent. Simulation results demonstrate that the proposed approach significantly outperforms the benchmark schemes.
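The two algorithmic ideas named in the abstract, sampling an action through a series of denoising steps and then refining it by gradient ascent, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the sampler, so a DDPM-style reverse process is assumed here, and `toy_noise_predictor`, `toy_q_value`, `toy_q_action_grad`, the linear beta schedule, and the `[-1, 1]` action bounds are all hypothetical stand-ins for the trained denoising network, the learned critic, and the real action space.

```python
import numpy as np


def make_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def toy_noise_predictor(action, state, t):
    """Hypothetical stand-in for a trained denoising network eps(a_t, t, s)."""
    return action - np.tanh(state[: action.shape[0]])


def sample_action(state, action_dim, betas, rng):
    """Generate an action by running DDPM-style reverse denoising steps."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    action = rng.standard_normal(action_dim)  # start from pure Gaussian noise
    for t in range(len(betas) - 1, -1, -1):
        eps = toy_noise_predictor(action, state, t)
        mean = (action - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is injected at the final denoising step.
        noise = rng.standard_normal(action_dim) if t > 0 else np.zeros(action_dim)
        action = mean + np.sqrt(betas[t]) * noise
    return np.clip(action, -1.0, 1.0)


def toy_q_value(state, action):
    """Hypothetical critic: a concave quadratic peaked at tanh(state)."""
    target = np.tanh(state[: action.shape[0]])
    return -np.sum((action - target) ** 2)


def toy_q_action_grad(state, action):
    """Analytic gradient of the toy critic with respect to the action."""
    target = np.tanh(state[: action.shape[0]])
    return -2.0 * (action - target)


def refine_action(state, action, step_size=0.1, num_steps=5):
    """Action gradient step: a <- a + eta * dQ/da, projected onto the bounds."""
    for _ in range(num_steps):
        action = np.clip(action + step_size * toy_q_action_grad(state, action), -1.0, 1.0)
    return action


rng = np.random.default_rng(0)
state = rng.standard_normal(6)  # state dimension must be >= action dimension here
raw_action = sample_action(state, action_dim=4, betas=make_beta_schedule(20), rng=rng)
refined_action = refine_action(state, raw_action)
print(toy_q_value(state, raw_action), toy_q_value(state, refined_action))
```

In a full method the refined state-action pairs would presumably serve as training targets for the diffusion policy; the sketch stops at the refinement step because the abstract gives no further detail.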