Learning-Based UAV Trajectory Optimization with Collision Avoidance and Connectivity Constraints

Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks, and determining collision-free trajectories for multiple UAVs while satisfying requirements of connectivity with ground base stations (GBSs) is a challenging task. In this paper, we first reformulate the multi-UAV trajectory optimization problem with collision avoidance and wireless connectivity constraints as a sequential decision-making problem in the discrete time domain. We then propose a decentralized deep reinforcement learning approach to solve the problem. More specifically, a value network is developed to encode the expected time to destination given the agent's joint state (including the agent's own information, the nearby agents' observable information, and the locations of the nearby GBSs). A signal-to-interference-plus-noise ratio (SINR)-prediction neural network is also designed to predict the UAV's SINR from the GBSs' locations; it is trained on SINR measurements accumulated while interacting with the cellular network. Numerical results show that, with the value network and the SINR-prediction network, real-time navigation for multiple UAVs can be performed efficiently in various environments with a high success rate.
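The abstract does not spell out how the two networks are combined at decision time. One plausible reading is a one-step lookahead: each agent tries a small set of candidate velocities, discards those that would violate the collision or connectivity constraint, and keeps the one with the lowest estimated time to destination. The sketch below illustrates that idea only. Every name in it (`choose_action`, `toy_time_to_goal`, `toy_sinr_db`, `ACTIONS`), the constant-velocity model for neighbours, the thresholds, and the log-distance SINR formula are assumptions made for illustration, and the two `toy_*` functions are simple stand-ins for the trained value and SINR-prediction networks, not the paper's models.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vec = Tuple[float, float]

# Hypothetical discrete action set: unit moves along each axis, or hover.
ACTIONS: List[Vec] = [(1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (-1.0, 0.0), (0.0, 0.0)]


def step(pos: Vec, vel: Vec, dt: float) -> Vec:
    """Propagate a position by one time step at constant velocity."""
    return (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)


def dist(a: Vec, b: Vec) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def toy_time_to_goal(pos: Vec, goal: Vec, v_max: float = 1.0) -> float:
    """Stand-in for the value network: straight-line time to the goal."""
    return dist(pos, goal) / v_max


def toy_sinr_db(pos: Vec, gbs: Sequence[Vec] = ((0.0, 0.0),)) -> float:
    """Stand-in for the SINR-prediction network (assumed log-distance decay)."""
    d = min(dist(pos, g) for g in gbs)
    return 20.0 - 20.0 * math.log10(max(d, 1.0))


def choose_action(
    pos: Vec,
    goal: Vec,
    others: Sequence[Tuple[Vec, Vec]],  # (position, velocity) of nearby agents
    actions: Sequence[Vec],
    value_fn: Callable[[Vec, Vec], float],
    sinr_fn: Callable[[Vec], float],
    dt: float = 1.0,
    min_sep: float = 1.0,
    sinr_min_db: float = 0.0,
) -> Optional[Vec]:
    """Return the feasible velocity with the lowest estimated time to goal."""
    best, best_time = None, math.inf
    for vel in actions:
        nxt = step(pos, vel, dt)
        # Collision constraint: neighbours are assumed to keep their velocity.
        if any(dist(nxt, step(p, v, dt)) < min_sep for p, v in others):
            continue
        # Connectivity constraint: predicted SINR must stay above the threshold.
        if sinr_fn(nxt) < sinr_min_db:
            continue
        total_time = dt + value_fn(nxt, goal)  # this step plus estimated remainder
        if total_time < best_time:
            best, best_time = vel, total_time
    return best  # None if every candidate violates a constraint


if __name__ == "__main__":
    start, goal = (0.0, 0.0), (5.0, 0.0)
    print(choose_action(start, goal, [], ACTIONS, toy_time_to_goal, toy_sinr_db))
```

Because each agent evaluates only its own candidates against locally observed neighbours and nearby GBSs, a rule of this shape can run independently on every UAV, which is consistent with the decentralized setting the abstract describes.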

