Deep Reinforcement Learning (DRL) is quickly becoming a popular method for training autonomous Unmanned Aerial Vehicles (UAVs). Our work analyzes the effects of measurement uncertainty on the performance of DRL-based waypoint navigation and obstacle avoidance for UAVs. Measurement uncertainty originates from noise in the sensors used for localization and obstacle detection, and is modeled as following a Gaussian probability distribution with unknown, non-zero mean and variance. We evaluate the performance of a DRL agent trained with the Proximal Policy Optimization (PPO) algorithm in an environment with continuous state and action spaces. The environment is randomized with a different number of obstacles in each simulation episode, under varying degrees of noise, to capture the effects of realistic sensor measurements. Denoising techniques such as the low-pass filter and the Kalman filter improve performance in the presence of unbiased noise. Moreover, we show that artificially injecting noise into the measurements during evaluation actually improves performance in certain scenarios. Extensive training and testing of the DRL agent under various UAV navigation scenarios are performed in the PyBullet physics simulator. To evaluate the practical validity of our method, we port the policy trained in simulation onto a real UAV without further modification and verify the results in a real-world environment.
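To make the noise model and the denoising step concrete, the sketch below corrupts a scalar sensor reading with biased Gaussian noise and smooths it with a first-order low-pass filter and a scalar Kalman filter. The function names, filter gains, and noise parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def noisy_measurement(true_value, mean, std, rng):
    """Corrupt a ground-truth reading with (possibly biased) Gaussian noise."""
    return true_value + rng.normal(mean, std)

class LowPassFilter:
    """First-order exponential low-pass filter: x_f = a*z + (1 - a)*x_f."""
    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.state = None

    def update(self, z):
        self.state = z if self.state is None else (
            self.alpha * z + (1 - self.alpha) * self.state)
        return self.state

class Kalman1D:
    """Scalar Kalman filter with a random-walk process model."""
    def __init__(self, process_var=1e-3, meas_var=0.04):
        self.q = process_var  # process noise variance
        self.r = meas_var     # measurement noise variance
        self.x = 0.0          # state estimate
        self.p = 1.0          # estimate variance

    def update(self, z):
        self.p += self.q                 # predict: uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward the measurement
        self.p *= (1 - k)
        return self.x

rng = np.random.default_rng(0)
lpf, kf = LowPassFilter(), Kalman1D()
for _ in range(100):
    z = noisy_measurement(true_value=1.0, mean=0.1, std=0.2, rng=rng)
    x_lpf, x_kf = lpf.update(z), kf.update(z)
```

Note that with a non-zero noise mean (the biased case above), both filters converge to a biased estimate, which is consistent with the abstract's observation that these denoising techniques help primarily for unbiased noise.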
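As a rough illustration of the training setup described above, the sketch below trains PPO (via the stable-baselines3 library, one common implementation) on a toy continuous-state, continuous-action navigation environment whose observations carry biased Gaussian noise. The environment, noise parameters, and reward are simplified stand-ins for the paper's PyBullet UAV simulation, not the authors' actual code.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class NoisyNavEnv(gym.Env):
    """Toy 2D waypoint-navigation environment with noisy observations.

    A hypothetical stand-in for the paper's PyBullet UAV environment:
    the agent observes the goal-relative position corrupted by biased
    Gaussian noise and is rewarded for closing the distance to the goal.
    """
    def __init__(self, noise_mean=0.05, noise_std=0.1):
        super().__init__()
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(2,),
                                            dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,),
                                       dtype=np.float32)
        self.noise_mean, self.noise_std = noise_mean, noise_std

    def _obs(self):
        # Biased Gaussian noise corrupts the relative-position measurement.
        noise = self.np_random.normal(self.noise_mean, self.noise_std, size=2)
        return (self.goal - self.pos + noise).astype(np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.pos = self.np_random.uniform(-5, 5, size=2).astype(np.float32)
        self.goal = self.np_random.uniform(-5, 5, size=2).astype(np.float32)
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        self.pos += 0.1 * np.clip(action, -1.0, 1.0)
        self.steps += 1
        dist = float(np.linalg.norm(self.goal - self.pos))
        terminated = dist < 0.2      # reached the waypoint
        truncated = self.steps >= 200
        return self._obs(), -dist, terminated, truncated, {}

model = PPO("MlpPolicy", NoisyNavEnv(), verbose=0)
model.learn(total_timesteps=10_000)
```

In the paper's setting the same idea extends to randomizing obstacle counts per episode and varying the noise level during training, which the toy environment omits for brevity.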