Reinforcement Learning (RL) has achieved impressive performance in video games from raw pixel images and in continuous control tasks. However, RL performs poorly with high-dimensional observations such as raw pixel images: it is generally accepted that policies based on physical state information, such as laser sensor measurements, are more sample-efficient than policies learned from pixels. This work presents a new approach that extracts information from a depth map estimate to train an RL agent to perform mapless navigation of an Unmanned Aerial Vehicle (UAV). We propose Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning (Depth-CUPRL), which estimates the depth of images and uses a prioritized replay memory. We combine RL with Contrastive Learning to address the problem of image-based RL. Our analysis of the results with UAVs shows that Depth-CUPRL is effective for decision-making and outperforms state-of-the-art pixel-based approaches in mapless navigation.
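To make the combination of contrastive learning and image-based RL concrete, below is a minimal sketch of a CURL-style contrastive update on depth images, assuming PyTorch. The encoder architecture, the random-crop augmentation, and the names `DepthEncoder` and `contrastive_loss` are illustrative assumptions, not the paper's exact implementation; sampling the batch from a prioritized replay buffer is indicated only in a comment.

```python
# Sketch of a CURL-style contrastive update on depth images (assumed PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthEncoder(nn.Module):
    """Small CNN mapping an 84x84 single-channel depth image to a latent vector."""
    def __init__(self, latent_dim: int = 50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.Linear(64 * 7 * 7, latent_dim)

    def forward(self, x):
        return self.fc(self.conv(x))

def random_crop(imgs: torch.Tensor, out_size: int = 84) -> torch.Tensor:
    """Random spatial crop, the standard augmentation in CURL-style methods."""
    _, _, h, w = imgs.shape
    top = torch.randint(0, h - out_size + 1, (1,)).item()
    left = torch.randint(0, w - out_size + 1, (1,)).item()
    return imgs[:, :, top:top + out_size, left:left + out_size]

def contrastive_loss(query_enc, key_enc, W, imgs):
    """InfoNCE loss: two crops of the same depth image form a positive pair;
    the other images in the batch serve as negatives."""
    q = query_enc(random_crop(imgs))            # anchor crops
    with torch.no_grad():
        k = key_enc(random_crop(imgs))          # positives from momentum encoder
    logits = q @ W @ k.t()                      # bilinear similarity scores
    logits = logits - logits.max(dim=1, keepdim=True).values  # numerical stability
    labels = torch.arange(logits.size(0), device=logits.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)

# Usage: the batch would come from a prioritized replay buffer in Depth-CUPRL;
# here it is random data for illustration.
query_enc = DepthEncoder()
key_enc = DepthEncoder()
key_enc.load_state_dict(query_enc.state_dict())  # key encoder tracks the query encoder
W = nn.Parameter(0.01 * torch.randn(50, 50))     # learned bilinear similarity matrix
imgs = torch.rand(32, 1, 100, 100)               # batch of 100x100 depth images
loss = contrastive_loss(query_enc, key_enc, W, imgs)
```

In CURL-style methods the key encoder is typically updated as an exponential moving average of the query encoder rather than by gradient descent, which keeps the contrastive targets stable.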