Recent studies show that Deep Reinforcement Learning (DRL) models are vulnerable to adversarial attacks, which mislead DRL models by adding small perturbations to their observations. However, some existing attacks assume full access to the victim model, and others require substantial computation, making them less feasible for real-world applications. In this work, we further explore the vulnerabilities of DRL by studying realistic and efficient attacks. First, we adapt and propose efficient black-box attacks for the setting where the attacker has no access to the DRL model parameters. Second, to address the high computational demands of existing attacks, we introduce efficient online sequential attacks that exploit temporal consistency across consecutive steps. Third, we explore the possibility of an attacker perturbing other aspects of the DRL setting, such as the environment dynamics. Finally, to account for imperfections in how an attacker would inject perturbations in the physical world, we devise a method for generating robust physical perturbations that can be printed; the resulting attack is evaluated on a real-world robot under various conditions. We conduct extensive experiments, both in simulated environments such as Atari games, robotics, and autonomous driving, and on real-world robots, to compare the effectiveness of the proposed attacks with baseline approaches. To the best of our knowledge, we are the first to apply adversarial attacks on DRL systems to physical robots.