Embodied agents for vision navigation built on deep neural networks have attracted increasing attention. However, deep neural networks have been shown to be vulnerable to malicious adversarial noise, which may cause catastrophic failures in embodied vision navigation. Among the different kinds of adversarial noise, universal adversarial perturbations (UAP), i.e., a constant image-agnostic perturbation applied to every input frame of the agent, play a critical role in embodied vision navigation because they are computationally efficient and practical to apply during an attack. Existing UAP methods, however, ignore the system dynamics of embodied vision navigation and may therefore be sub-optimal. To extend UAP to the sequential decision setting, we formulate the environment disturbed by the universal noise $\delta$ as a $\delta$-disturbed Markov Decision Process ($\delta$-MDP). Based on this formulation, we analyze the properties of the $\delta$-MDP and propose two novel Consistent Attack methods for attacking embodied agents, named Reward UAP and Trajectory UAP. Both methods account for the dynamics of the MDP and compute the universal noise by estimating the disturbed distribution and the disturbed Q function. Across various victim models, our Consistent Attack causes a significant drop in performance on the PointGoal task in Habitat with different datasets and different scenes. Extensive experimental results indicate that applying embodied vision navigation methods in the real world carries serious potential risks.
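
To make the notion of a universal perturbation concrete, the following minimal sketch shows a single fixed $\delta$ being added to every frame of an episode. It only illustrates the "constant, image-agnostic" property described above; it does not implement Reward UAP or Trajectory UAP. The function name `apply_universal_perturbation`, the $L_\infty$ budget `epsilon`, and the random frames are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def apply_universal_perturbation(frames, delta, epsilon):
    """Add one fixed perturbation to every frame of an episode (illustrative)."""
    # Project delta onto an L-infinity ball of radius epsilon (assumed budget).
    delta = np.clip(delta, -epsilon, epsilon)
    # Broadcasting adds the same delta to each frame; keep pixels in [0, 1].
    return np.clip(frames + delta, 0.0, 1.0)

rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8, 3))          # 5 hypothetical RGB frames in [0, 1]
delta = rng.uniform(-0.1, 0.1, (8, 8, 3))  # one image-agnostic perturbation
perturbed = apply_universal_perturbation(frames, delta, epsilon=0.03)
```

Because `delta` does not depend on the individual frame, it can be computed once offline and reused at every time step, which is what makes this kind of attack cheap to apply.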