Real-time Adversarial Perturbations against Deep Reinforcement Learning Policies: Attacks and Defenses

Recent work has shown that deep reinforcement learning (DRL) policies are vulnerable to adversarial perturbations. Adversaries can mislead the policies of DRL agents by perturbing the state of the environment observed by the agents. Existing attacks are feasible in principle but face challenges in practice, for example by being too slow to fool DRL policies in real time. We show that using the Universal Adversarial Perturbation (UAP) method to compute perturbations, independent of the individual inputs to which they are applied, can fool DRL policies effectively and in real time. We describe three such attack variants. Via an extensive evaluation using three Atari 2600 games, we show that our attacks are effective, as they fully degrade the performance of three different DRL agents (up to 100%, even when the $l_\infty$ bound on the perturbation is as small as 0.01). Our attacks are faster than the response time (0.6 ms on average) of different DRL policies, and considerably faster than prior attacks using adversarial perturbations (1.8 ms on average). We also show that our attack technique is efficient, incurring an online computational cost of 0.027 ms on average. Using two further tasks involving robotic movement, we confirm that our results generalize to more complex DRL tasks. Furthermore, we demonstrate that the effectiveness of known defenses diminishes against universal perturbations. We propose an effective technique that detects all known adversarial perturbations against DRL policies, including all the universal perturbations presented in this paper.
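To make the idea of an input-independent perturbation concrete, the sketch below shows only the online step: adding one precomputed perturbation to every observation while respecting an $l_\infty$ bound of 0.01. It is an illustration under stated assumptions, not the paper's attack. The 84x84 frame size, the scaling of observations to [0, 1], the helper names `project_linf` and `apply_universal_perturbation`, and the random noise standing in for an offline-computed UAP are all hypothetical.

```python
import numpy as np

EPSILON = 0.01  # l_inf bound quoted in the abstract; observations assumed in [0, 1]


def project_linf(perturbation, epsilon=EPSILON):
    """Clip a perturbation so every entry lies in [-epsilon, epsilon]."""
    return np.clip(perturbation, -epsilon, epsilon)


def apply_universal_perturbation(observation, perturbation):
    """Add the same precomputed perturbation to an observation (online step).

    The result is clipped back to the assumed valid pixel range [0, 1].
    """
    return np.clip(observation + perturbation, 0.0, 1.0)


rng = np.random.default_rng(0)

# Placeholder only: random noise stands in for a perturbation that a real
# attack would compute offline. The 84x84 shape is an assumption.
uap = project_linf(rng.normal(scale=0.05, size=(84, 84)))

# Hypothetical batch of observed frames; the same `uap` is reused for all.
frames = rng.random((5, 84, 84))
perturbed = np.stack([apply_universal_perturbation(f, uap) for f in frames])
```

Because the perturbation does not depend on the current frame, the per-step work is one addition and one clip, which is why an attack of this kind can keep up with a policy's response time.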


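Related content

Deep reinforcement learning

Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning with deep learning techniques. The main task of traditional reinforcement learning is to enable an agent to learn reward-maximizing behavior from the rewards it receives from the environment. However, traditional model-free reinforcement learning methods need function approximation for the agent to learn a value function or a policy. In this setting, the strong function approximation capability of deep learning naturally became the best replacement for hand-specified features and made better-performing end-to-end learning possible.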
