Autonomous navigation in unknown, complex environments remains a hard problem, especially for small Unmanned Aerial Vehicles (UAVs) with limited computational resources. In this paper, a neural network-based reactive controller is proposed that lets a quadrotor fly autonomously in unknown outdoor environments. The navigation controller uses only current sensor data to generate the control signal, without any optimization or configuration-space search, which reduces both memory and computation requirements. The navigation problem is modelled as a Markov Decision Process (MDP) and solved using a deep reinforcement learning (DRL) method. To better understand the trained network, several model explanation methods are proposed. Based on feature attribution, each decision made during flight is explained with both visual and textual explanations. Moreover, some global analyses are provided for experts to evaluate and improve the trained neural network. The simulation results show that the proposed method produces useful and reasonable explanations of the trained model, which benefits both non-expert users and controller designers. Finally, real-world tests show that the proposed controller can navigate the quadrotor to the goal position successfully, and that the reactive controller runs much faster than some conventional approaches given the same computational resources.
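To make the idea of a reactive controller with per-decision feature attribution concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical observation of three range readings plus a goal bearing, a tiny network with made-up (untrained) weights, and occlusion-based attribution, which is swapped in here as one simple attribution technique because the abstract does not name the one actually used. All names (`policy`, `occlusion_attribution`, `explain`) are illustrative.

```python
import math

# Hypothetical toy policy: 3 range readings (left, front, right, in metres)
# plus a goal bearing (radians) -> yaw-rate command in [-1, 1].
# The weights below are made up for illustration; they are not trained.
W_HIDDEN = [
    [0.8, 0.0, -0.8, 0.5],
    [-0.3, -0.6, -0.3, 0.0],
]
B_HIDDEN = [0.0, 1.0]
W_OUT = [1.0, 0.4]
B_OUT = 0.0


def policy(obs):
    """Map the current observation directly to a command (no search, no memory)."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, obs)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    return math.tanh(sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT)


def occlusion_attribution(obs, baseline):
    """Score each input by the output change when it is replaced by its baseline value."""
    reference = policy(obs)
    scores = []
    for i in range(len(obs)):
        perturbed = list(obs)
        perturbed[i] = baseline[i]
        scores.append(reference - policy(perturbed))
    return scores


def explain(obs, baseline, names):
    """Turn the attribution scores into a one-line textual explanation."""
    scores = occlusion_attribution(obs, baseline)
    top = max(range(len(scores)), key=lambda i: abs(scores[i]))
    direction = "raised" if scores[top] > 0 else "lowered"
    return f"{names[top]} {direction} the yaw-rate command the most ({scores[top]:+.3f})"


if __name__ == "__main__":
    names = ["left range", "front range", "right range", "goal bearing"]
    obs = [0.5, 3.0, 4.0, 0.2]        # assumed current sensor snapshot
    baseline = [5.0, 5.0, 5.0, 0.0]   # assumed "free space, goal straight ahead" reference
    print(explain(obs, baseline, names))
```

The choice of baseline matters: an input that already equals its baseline value receives a score of exactly zero, so the explanation is always relative to the reference situation that the baseline encodes.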