Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications - 专知 (Zhuanzhi) Papers


UAV · DQN · NOMA · Deep Q-Network · Agent

March 29, 2023

Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications


Danish Rizvi, David Boyle

Unmanned Aerial Vehicles (UAVs) are increasingly used as aerial base stations to provide ad hoc communications infrastructure. Building upon prior research efforts which consider either static nodes, 2D trajectories or single UAV systems, this paper focuses on the use of multiple UAVs for providing wireless communication to mobile users in the absence of terrestrial communications infrastructure. In particular, we jointly optimize UAV 3D trajectory and NOMA power allocation to maximize system throughput. Firstly, a weighted K-means-based clustering algorithm establishes UAV-user associations at regular intervals. The efficacy of training a novel Shared Deep Q-Network (SDQN) with action masking is then explored. Unlike training each UAV separately using DQN, the SDQN reduces training time by using the experiences of multiple UAVs instead of a single agent. We also show that SDQN can be used to train a multi-agent system with differing action spaces. Simulation results confirm that: 1) training a shared DQN outperforms a conventional DQN in terms of maximum system throughput (+20%) and training time (-10%); 2) it can converge for agents with different action spaces, yielding a 9% increase in throughput compared to mutual learning algorithms; and 3) combining NOMA with an SDQN architecture enables the network to achieve a better sum rate compared with existing baseline schemes.
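To illustrate how action masking can let one shared Q-network serve agents with differing action spaces, here is a minimal sketch. It is not the authors' implementation: the seven-action union space, the `ACTIONS` names, and the helper functions are hypothetical, and the Q-values are hard-coded rather than produced by a trained network. The idea is only that invalid actions are set to negative infinity before the argmax, and excluded from random exploration.

```python
import numpy as np

def masked_greedy_action(q_values, valid_mask):
    """Return the index of the highest-valued action among the valid ones."""
    q_values = np.asarray(q_values, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions get -inf so argmax can never select them.
    masked_q = np.where(valid_mask, q_values, -np.inf)
    return int(np.argmax(masked_q))

def masked_epsilon_greedy(q_values, valid_mask, epsilon, rng):
    """Epsilon-greedy selection that explores only among valid actions."""
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if rng.random() < epsilon:
        return int(rng.choice(np.flatnonzero(valid_mask)))
    return masked_greedy_action(q_values, valid_mask)

# Hypothetical union action space shared by all agents (an assumption).
ACTIONS = ["north", "south", "east", "west", "up", "down", "hover"]

# One agent may use every action; another is restricted to a fixed altitude.
mask_3d = np.ones(len(ACTIONS), dtype=bool)
mask_2d = np.array([1, 1, 1, 1, 0, 0, 1], dtype=bool)

# Stand-in for the output of a shared Q-network for some state.
q = np.array([0.2, 0.1, 0.4, 0.3, 0.9, 0.8, 0.0])

action_3d = ACTIONS[masked_greedy_action(q, mask_3d)]
action_2d = ACTIONS[masked_greedy_action(q, mask_2d)]
```

With the same Q-values, the unrestricted agent picks `up`, while the altitude-restricted agent falls back to `east`, its best remaining valid action. In a full training loop the same mask would presumably also be applied when computing the bootstrap target over next-state actions.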

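The weighted K-means step that associates users with UAVs can be sketched as follows. The abstract does not say what the weights represent, so this example assumes each user carries a positive scalar weight (for instance a traffic demand) and that each cluster centre is the weighted mean of its members. The function name `weighted_kmeans` and its parameters are illustrative, not taken from the paper.

```python
import numpy as np

def weighted_kmeans(points, weights, k, n_iter=50, seed=0):
    """Cluster points into k groups; centroids are weighted means of members.

    Assumes all weights are strictly positive.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()

    def assign(cents):
        # Distance from every point to every centroid, then nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - cents[None, :, :], axis=2)
        return np.argmin(dists, axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = labels == j
            if members.any():  # an empty cluster keeps its previous centroid
                new_centroids[j] = np.average(
                    points[members], axis=0, weights=weights[members]
                )
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final labels are always consistent with the returned centroids.
    return assign(centroids), centroids

# Four hypothetical users on the ground; the second user has a higher weight.
users = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
demand = [1.0, 3.0, 1.0, 1.0]
labels, centroids = weighted_kmeans(users, demand, k=2)
```

Each centroid can then serve as a candidate horizontal position for one UAV, with `labels` giving the user association. Because the second user has three times the weight of the first, the centroid of their cluster sits closer to the second user than a plain mean would place it.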

