Bistatic backscatter communication promises ubiquitous, massive connectivity for future Internet-of-Things (IoT) networks by using passive tags that communicate with a reader by reflecting carrier emitter (CE) signals. This study focuses on the joint design of the transmit/receive beamformers at the CE/reader and the reflection coefficient of the tag. A throughput maximization problem is formulated, subject to the tag's operating requirements. We develop the joint design through a series of trial-and-error interactions with the environment, driven by a predefined reward in a continuous state and action space. We propose two deep reinforcement learning (DRL) algorithms to solve the underlying optimization problem, namely deep deterministic policy gradient (DDPG) and soft actor-critic (SAC). Simulation results indicate that the proposed algorithms learn from the environment and incrementally improve their behavior, achieving performance on par with two leading benchmarks. We also compare the proposed methods with deep Q-network (DQN), double deep Q-network (DDQN), and dueling DQN (DuelDQN). For a system with twelve antennas, SAC leads with a 26.76% gain over DQN, followed by alternating optimization (AO) and DDPG at 23.02% and 19.16%, respectively. DDQN and DuelDQN show smaller gains of 10.40% and 14.36% over DQN.
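To make the problem setup concrete, the following is a minimal sketch (not the paper's method) of the trial-and-error interaction the abstract describes: the continuous action consists of a transmit beamformer, a receive beamformer, and a tag reflection coefficient, and the reward is the backscatter-link throughput. All dimensions, channel models, and the random-search agent standing in for the DDPG/SAC policy updates are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative, not from the paper):
# M antennas at the CE, N antennas at the reader.
M, N = 12, 12
noise_power = 1e-3

# Assumed i.i.d. Rayleigh flat-fading channels:
# h_f: CE -> tag forward link, h_b: tag -> reader backscatter link.
h_f = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
h_b = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

def throughput(w_t, w_r, alpha):
    """Reward: rate of the CE -> tag -> reader link for transmit beam w_t,
    receive beam w_r, and tag reflection coefficient alpha (0 <= alpha <= 1)."""
    # The CE signal reaches the tag, is scaled by alpha, and re-radiated
    # to the reader, where the receive beamformer is applied.
    signal = alpha * (w_r.conj() @ h_b) * (h_f @ w_t)
    snr = np.abs(signal) ** 2 / noise_power
    return np.log2(1 + snr)

def random_action():
    """Continuous action: unit-norm beamformers plus a reflection coefficient."""
    w_t = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    w_r = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    return w_t / np.linalg.norm(w_t), w_r / np.linalg.norm(w_r), rng.uniform(0, 1)

# Trial-and-error loop: keep the best action seen so far. Random search here
# is only a placeholder for the DRL policy updates (DDPG/SAC) in the paper.
best = -np.inf
for _ in range(2000):
    best = max(best, throughput(*random_action()))

# Matched-filter benchmark for this single-tag link: align both beams with
# their channels and reflect at full strength (alpha = 1).
opt = throughput(h_f.conj() / np.linalg.norm(h_f),
                 h_b / np.linalg.norm(h_b), 1.0)
print(f"random-search best rate: {best:.2f} bit/s/Hz (matched-filter bound {opt:.2f})")
```

The matched-filter solution gives a simple upper bound for this toy single-tag link; the learned policies in the paper must instead satisfy the tag's operating requirements while maximizing throughput.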