With the development of industry, drones are appearing in a wide range of fields. In recent years, deep reinforcement learning has made impressive gains in games, and we aim to apply deep reinforcement learning algorithms to robotics, moving them from game scenarios to real-world applications. Inspired by the LunarLander environment of OpenAI Gym, we attempt to use reinforcement learning to control drones. At present, little work applies reinforcement learning algorithms to robot control: the physical simulation platforms used for robot control are suited only to verifying classical algorithms and are not designed for training reinforcement learning algorithms. In this paper, we address this problem by bridging the gap between physical simulation platforms and intelligent agents, connecting agents to a physical simulation platform so that they can learn and complete drone flight tasks in a simulator that approximates the real world. We propose ROS-RL, a reinforcement learning framework based on the Gazebo physical simulation platform, and use three continuous-action-space reinforcement learning algorithms within the framework to tackle the autonomous landing of drones. Experiments demonstrate the effectiveness of the approach: the reinforcement-learning-based autonomous landing task was fully successful.