In this study, reinforcement learning was applied to two-dimensional path planning with obstacle avoidance for an unmanned aerial vehicle (UAV) in an indoor environment. The UAV's task was to reach the goal position in the shortest possible time without colliding with any obstacles. To reduce training time and cost, reinforcement learning was performed in a virtual environment built with the Gazebo simulator. Curriculum learning consisting of two stages was used to make learning more efficient. Training with two reward models yielded maximum goal rates of 71.2% and 88.0%, respectively.