We consider the problem of autonomous channel access (AutoCA), in which a group of terminals tries, in a distributed fashion, to discover a strategy for communicating with an access point (AP) over a common wireless channel. Because of the irregular topology and the limited communication range of the terminals, a practical challenge for AutoCA is the hidden terminal problem, which is notorious in wireless networks for degrading throughput and delay performance. To meet this challenge, this paper presents a new multi-agent deep reinforcement learning paradigm, dubbed MADRL-HT, tailored for AutoCA in the presence of hidden terminals. MADRL-HT exploits topological insights and transforms the observation space of each terminal into a scalable form that is independent of the number of terminals. To compensate for the partial observability, we put forth a look-back mechanism that lets each terminal infer the behavior of its hidden terminals from the carrier-sensed channel states as well as from the feedback of the AP. We also propose a window-based global reward function, under which the terminals learn to maximize the system throughput while balancing the transmission opportunities among terminals over the course of learning. Extensive numerical experiments verify the superior performance of our solution when benchmarked against the legacy carrier-sense multiple access with collision avoidance (CSMA/CA) protocol.
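To make the idea of a window-based global reward concrete, the following is a minimal sketch under stated assumptions, not the reward function defined in the paper. It assumes that a slot succeeds only when exactly one terminal transmits, that throughput is the fraction of successful slots in a sliding window, and that fairness is measured with Jain's index. The names `WindowReward` and `jain_index`, and the choice to multiply the two terms, are hypothetical and serve only as illustration.

```python
from collections import deque


def jain_index(counts):
    """Jain's fairness index: 1/n if one terminal takes everything, 1.0 if all are equal."""
    total = sum(counts)
    if total == 0:
        return 0.0  # no successful transmissions yet
    return total * total / (len(counts) * sum(c * c for c in counts))


class WindowReward:
    """Illustrative global reward over a sliding window of slots (assumed form)."""

    def __init__(self, num_terminals, window):
        self.num_terminals = num_terminals
        self.window = window
        # One entry per slot: the id of the successful terminal, or None.
        self.history = deque(maxlen=window)

    def step(self, transmitters):
        """Record one slot and return the shared reward.

        transmitters: list of terminal ids that transmitted in this slot.
        A slot is successful only if exactly one terminal transmitted;
        two or more transmitters collide at the AP.
        """
        winner = transmitters[0] if len(transmitters) == 1 else None
        self.history.append(winner)

        # Successful transmissions per terminal within the window.
        counts = [0] * self.num_terminals
        for w in self.history:
            if w is not None:
                counts[w] += 1

        throughput = sum(counts) / self.window  # fraction of successful slots
        return throughput * jain_index(counts)  # assumed combination rule


reward = WindowReward(num_terminals=2, window=4)
rewards = [reward.step(slot) for slot in ([0], [1], [0, 1], [0])]
```

In this sketch every terminal receives the same scalar, so a terminal that monopolizes the channel lowers the fairness term for all agents, while collisions and idle slots lower the throughput term.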