We study the problem of user association, namely finding the optimal assignment of user equipment to base stations to achieve a targeted network performance. In this paper, we focus on the knowledge transferability of association policies. Traditional non-trivial user association schemes are often scenario-specific or deployment-specific and require the policy to be re-designed or re-learned whenever the number or the position of the users changes. In contrast, transferability allows a single user association policy, devised for a specific scenario, to be applied to other distinct user deployments without a substantial re-learning or re-design phase, which considerably reduces computational and management complexity. To achieve transferability, we first cast user association as a multi-agent reinforcement learning problem. Then, based on a neural attention mechanism that we specifically conceived for this context, we propose a novel distributed policy network architecture, which is transferable among users with zero-shot generalization capability, i.e., without requiring additional training. Numerical results show the effectiveness of our solution in terms of overall network communication rate, outperforming centralized benchmarks even when the number of users doubles with respect to the initial training point.
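To illustrate why an attention-based policy can be reused when the number of users changes, the following is a minimal sketch and not the architecture proposed in the paper. It assumes each user is described by a small feature vector (here, hypothetically, one signal-strength value per base station) and that every user runs the same policy with shared weights. The class `AttentionPolicy`, the function `associate`, and all dimensions are illustrative assumptions; the weights are random, since no training is shown.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a non-empty list."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


class AttentionPolicy:
    """Hypothetical per-user policy (assumption, not the paper's network).

    Weight shapes depend only on feature sizes, never on the number of
    users, so the same weights apply to any deployment size.
    """

    def __init__(self, feat_dim, hidden_dim, num_bs, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.Wq = random_matrix(hidden_dim, feat_dim, rng)  # query projection
        self.Wk = random_matrix(hidden_dim, feat_dim, rng)  # key projection
        self.Wv = random_matrix(hidden_dim, feat_dim, rng)  # value projection
        self.Wo = random_matrix(num_bs, 2 * hidden_dim, rng)  # output layer

    def act_probs(self, own, others):
        """Return a probability distribution over base stations."""
        q = matvec(self.Wq, own)
        context = [0.0] * self.hidden_dim
        if others:
            keys = [matvec(self.Wk, o) for o in others]
            vals = [matvec(self.Wv, o) for o in others]
            scale = math.sqrt(self.hidden_dim)
            scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
            attn = softmax(scores)  # one weight per other user
            # Weighted sum of values: fixed size for any number of users.
            context = [sum(a * v[j] for a, v in zip(attn, vals))
                       for j in range(self.hidden_dim)]
        return softmax(matvec(self.Wo, q + context))


def associate(policy, user_feats):
    """Each user independently picks its most likely base station."""
    choices = []
    for i, own in enumerate(user_feats):
        others = user_feats[:i] + user_feats[i + 1:]
        probs = policy.act_probs(own, others)
        choices.append(max(range(len(probs)), key=probs.__getitem__))
    return choices


# Same (untrained) weights applied to 4 users and then to 8 users.
rng = random.Random(1)
policy = AttentionPolicy(feat_dim=3, hidden_dim=4, num_bs=3)
users4 = [[rng.random() for _ in range(3)] for _ in range(4)]
users8 = [[rng.random() for _ in range(3)] for _ in range(8)]
print(associate(policy, users4))
print(associate(policy, users8))
```

The design point is that the attention step collapses a variable-length set of other users into a fixed-size context vector, so nothing in the policy needs to be resized or retrained when the user count changes. Whether the resulting decisions remain good at the new size is an empirical question, which the numerical results in the paper address.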