Federated Multi-Agent Actor-Critic Learning for Age Sensitive Mobile Edge Computing

As an emerging technique, mobile edge computing (MEC) introduces a new processing scheme for various distributed communication-computing systems such as the industrial Internet of Things (IoT), vehicular communication, and smart cities. In this work, we focus on the timeliness of MEC systems in which the freshness of data and computation tasks is significant. First, we formulate a class of age-sensitive MEC models and define the average age of information (AoI) minimization problem of interest. Then, a novel policy-based multi-agent deep reinforcement learning (RL) framework, called heterogeneous multi-agent actor-critic (H-MAAC), is proposed as a paradigm for joint collaboration in the investigated MEC systems, where the edge devices and the central controller learn interactive strategies from their own observations. To improve system performance, we develop the corresponding online algorithm by introducing an edge federated learning mode into the multi-agent cooperation, whose advantage in learning convergence can be guaranteed theoretically. To the best of our knowledge, this is the first joint MEC collaboration algorithm that combines the edge federated mode with multi-agent actor-critic reinforcement learning. Furthermore, we evaluate the proposed approach and compare it with classical RL-based methods. The proposed framework not only outperforms the baselines on average system age, but also improves the stability of the training process. In addition, the simulation results provide some new perspectives on system design under edge federated collaboration.
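The abstract names two quantities without giving formulas: the average AoI to be minimized and the edge federated mode used during multi-agent training. The sketch below illustrates both under stated assumptions, and is not the paper's formulation. It assumes discrete time slots, an age that grows by one per slot and resets to (delivery slot minus generation slot) when an update arrives, and a FedAvg-style weighted mean of flat parameter vectors as the federated step. The function names `average_aoi` and `federated_average` are hypothetical.

```python
from typing import Dict, List, Optional


def average_aoi(deliveries: Dict[int, int], horizon: int) -> float:
    """Time-average age of information over slots 1..horizon.

    Assumption: `deliveries` maps the slot in which an update is received
    to the slot in which that update was generated. The age starts at 0
    before slot 1, grows by 1 per slot, and resets on each delivery.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    age = 0
    total = 0
    for t in range(1, horizon + 1):
        if t in deliveries:
            # Fresh update received: age equals the time since generation.
            age = t - deliveries[t]
        else:
            # No update this slot: the information gets one slot older.
            age += 1
        total += age
    return total / horizon


def federated_average(
    local_params: List[List[float]],
    weights: Optional[List[float]] = None,
) -> List[float]:
    """Weighted mean of per-device parameter vectors (FedAvg-style).

    Assumption: every device holds a flat parameter vector of the same
    length; `weights` defaults to equal weighting.
    """
    if not local_params:
        raise ValueError("need at least one parameter vector")
    if weights is None:
        weights = [1.0] * len(local_params)
    if len(weights) != len(local_params):
        raise ValueError("one weight per device is required")
    total_weight = sum(weights)
    dim = len(local_params[0])
    return [
        sum(w * p[i] for w, p in zip(weights, local_params)) / total_weight
        for i in range(dim)
    ]


if __name__ == "__main__":
    # One update generated in slot 2 and received in slot 3.
    print(average_aoi({3: 2}, horizon=5))
    # Two edge devices averaging their (toy) actor parameters.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]]))
```

In an actor-critic setting, a step like `federated_average` would typically be applied periodically to the network parameters of the edge agents; which networks are shared, and how often, is a design choice the abstract does not specify.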
