Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent's observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce substantially different activations in the victim policy network than when the victim plays against a normal opponent. Videos are available at https://adversarialpolicies.github.io/.
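For context, the attack described above reduces to ordinary single-agent RL: the victim's policy is frozen and embedded inside the environment, and the adversary is then trained against the combined system. The sketch below illustrates that reduction; `TwoPlayerEnv`, `victim_policy`, and all other names are hypothetical placeholders, not the paper's code, and the example does not reproduce the MuJoCo setup used in the experiments.

```python
class VictimEmbeddedEnv:
    """Wraps a two-player zero-sum environment so that a frozen victim policy
    becomes part of the environment dynamics. From the adversary's point of
    view, the result is a standard single-agent environment (illustrative
    sketch only; interface names are assumptions)."""

    def __init__(self, two_player_env, victim_policy):
        self.env = two_player_env          # hypothetical two-player environment
        self.victim_policy = victim_policy # frozen: never updated during the attack

    def reset(self):
        obs_adversary, self._obs_victim = self.env.reset()
        return obs_adversary

    def step(self, adversary_action):
        # The victim acts from its own (proprioceptive) observation;
        # its parameters stay fixed throughout training of the adversary.
        victim_action = self.victim_policy(self._obs_victim)
        (obs_adversary, self._obs_victim), rewards, done, info = self.env.step(
            (adversary_action, victim_action)
        )
        # Zero-sum game: the adversary is rewarded when the victim loses.
        return obs_adversary, rewards[0], done, info
```

Any standard policy-gradient algorithm (the paper trains the adversary with PPO) can then be run on such a wrapper without modification, since the frozen victim simply appears to the adversary as part of the stochastic transition dynamics.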