The danger posed by adversarial attacks on Uncrewed Aerial Vehicle (UAV) agents operating in public spaces is increasing. Adopting AI-based techniques and, more specifically, Deep Learning (DL) approaches to control and guide these UAVs can improve performance, but it also raises concerns about the safety of those techniques and their vulnerability to adversarial attacks. By confusing the agent's decision-making process, these attacks can seriously compromise the safety of the UAV. This paper proposes an innovative approach based on the explainability of DL methods to build an efficient detector that protects these DL schemes, and the UAVs adopting them, from attacks. The agent adopts a Deep Reinforcement Learning (DRL) scheme for guidance and planning. It is trained with a Deep Deterministic Policy Gradient (DDPG) scheme with Prioritised Experience Replay (PER) that utilises an Artificial Potential Field (APF) to improve training times and obstacle avoidance performance. A simulated environment for explainable DRL-based UAV planning and guidance, including obstacles and adversarial attacks, is built. The adversarial attacks are generated with the Basic Iterative Method (BIM) algorithm and reduce obstacle course completion rates from 97% to 35%. Two adversarial attack detectors are proposed to counter this reduction. The first is a Convolutional Neural Network Adversarial Detector (CNN-AD), which achieves a detection accuracy of 80%. The second detector utilises a Long Short-Term Memory (LSTM) network. It achieves an accuracy of 91% with faster computing times than the CNN-AD, allowing for real-time adversarial detection.
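For readers unfamiliar with the Basic Iterative Method named above, the following is a minimal sketch of its update rule: repeated signed-gradient steps, each projected back into an L-infinity ball of radius eps around the clean input. The abstract does not state the attack loss, step size, or perturbation budget, so everything here is an assumption for illustration: the toy `policy` (a single tanh unit standing in for the DRL actor), the weights `w`, the observation `obs`, and the values of `eps`, `alpha`, and `steps` are all hypothetical.

```python
import numpy as np

def bim_attack(x, grad_fn, eps=0.1, alpha=0.02, steps=10):
    """Basic Iterative Method under an L-infinity budget.

    Repeats a signed-gradient ascent step of size alpha, then clips the
    result so it never leaves the eps-ball around the clean input x.
    """
    x = np.asarray(x, dtype=float)
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# Hypothetical stand-in for the agent (NOT the paper's DDPG actor):
# a single tanh unit mapping a 4-D observation to a steering command.
w = np.array([0.5, -1.0, 0.25, 0.8])
obs = np.array([0.2, 0.1, -0.3, 0.4])

def policy(o):
    return np.tanh(w @ o)

def policy_grad(o):
    # Analytic gradient of tanh(w . o) with respect to the observation o.
    return (1.0 - np.tanh(w @ o) ** 2) * w

# Assumed attacker goal: push the steering command as high as possible.
eps = 0.1
obs_adv = bim_attack(obs, policy_grad, eps=eps, alpha=0.02, steps=10)

print("clean command:    ", policy(obs))
print("perturbed command:", policy(obs_adv))
print("max perturbation: ", np.max(np.abs(obs_adv - obs)))
```

In this toy setting the perturbation stays within the budget while still shifting the command. Against a real actor network the gradient would come from automatic differentiation instead of a hand-written formula.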