5G and beyond networks need to provide dynamic and efficient infrastructure management to better adapt to time-varying user behaviors (e.g., user mobility, interference, user traffic, and the evolution of the network topology). In this paper, we propose to manage the trajectories of Mobile Access Points (MAPs) under all of these dynamic constraints with reduced complexity. We first formulate the placement problem of managing MAPs over time. Our solution addresses time-varying user traffic and user mobility through Multi-Agent Deep Reinforcement Learning (MADRL). To achieve real-time behavior, the proposed solution learns to perform distributed assignment of MAP-user positions and schedules the MAP path among all users without centralized user-clustering feedback. Our solution exploits a dual-attention MADRL model via proximal policy optimization to dynamically move MAPs in 3D. The dual attention takes into account information from both users and MAPs. The cooperation mechanism of our solution allows it to handle different scenarios without a priori information and without re-training, which significantly reduces complexity.
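As a rough illustration of the dual-attention idea, the following sketch lets one MAP agent attend separately over user features and over the features of other MAPs, then concatenates both context vectors with its own state. It is not the authors' architecture: the function names, the feature dimension, and the use of plain scaled dot-product attention without learned projections are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(query, rows):
    # Scaled dot-product attention of one query over a set of feature rows.
    # The rows serve as both keys and values (no learned projections here).
    scores = rows @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    return weights @ rows, weights

def dual_attention_observation(own_state, user_feats, map_feats):
    # Hypothetical helper: one attention pass over users, one over other MAPs.
    user_ctx, user_w = attend(own_state, user_feats)
    map_ctx, map_w = attend(own_state, map_feats)
    # The output size depends only on the feature dimension,
    # not on how many users or MAPs are present.
    obs = np.concatenate([own_state, user_ctx, map_ctx])
    return obs, user_w, map_w

rng = np.random.default_rng(0)
own = rng.normal(size=4)          # assumed 4-dimensional agent state
users = rng.normal(size=(5, 4))   # 5 users, same feature dimension
maps = rng.normal(size=(2, 4))    # 2 other MAPs
obs, user_w, map_w = dual_attention_observation(own, users, maps)
```

Because attention pools a variable number of rows into a fixed-size context vector, the same policy input shape is kept when the number of users or MAPs changes, which is one plausible reason such a model could be reused across scenarios without re-training.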