Evaluating the worst-case performance of a reinforcement learning (RL) agent under the strongest (optimal) adversarial perturbations on state observations, within some constraints, is crucial for understanding the robustness of RL agents. However, finding the optimal adversary is challenging, both in whether the optimal attack can be found at all and in how efficiently it can be found. Existing work on adversarial RL either uses heuristics-based methods that may not find the strongest adversary, or directly trains an RL-based adversary by treating the agent as part of the environment, which can find the optimal adversary but may become intractable in a large state space. This paper introduces a novel attack method that finds the optimal attacks through collaboration between a designed function named "actor" and an RL-based learner named "director". The actor crafts state perturbations for a given policy perturbation direction, and the director learns to propose the best policy perturbation directions. Our proposed algorithm, PA-AD, is theoretically optimal and significantly more efficient than prior RL-based works in environments with large state spaces. Empirical results show that our proposed PA-AD universally outperforms state-of-the-art attacking methods in various Atari and MuJoCo environments. By applying PA-AD to adversarial training, we achieve state-of-the-art empirical robustness in multiple tasks under strong adversaries.
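To make the actor/director split concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy victim policy (softmax over linear logits), an l_inf perturbation budget `eps`, and a single signed-gradient step standing in for whatever optimization the actor actually performs. The names `victim_policy`, `actor`, and `director_stub` are hypothetical, and the director here is a fixed placeholder rather than a trained RL learner.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def victim_policy(W, s):
    """Toy victim (assumption): action probabilities are softmax(W @ s)."""
    return softmax(W @ s)


def actor(W, s, direction, eps):
    """Craft a perturbed state inside the l_inf ball of radius eps around s.

    Takes one signed-gradient step that increases <pi(s'), direction>,
    i.e. pushes the victim's action distribution along `direction`.
    """
    p = victim_policy(W, s)
    # Gradient of <softmax(z), direction> w.r.t. logits z is
    # p * (direction - <p, direction>); chain rule through z = W @ s.
    grad_logits = p * (direction - p @ direction)
    grad_s = W.T @ grad_logits
    return s + eps * np.sign(grad_s)


def director_stub(n_actions, target_action):
    """Placeholder for the learned director (assumption).

    Returns a fixed policy perturbation direction that moves probability
    mass toward `target_action`. In the paper's setting this direction
    would be proposed by an RL learner; it is not trained here.
    """
    d = -np.ones(n_actions) / (n_actions - 1)
    d[target_action] = 1.0
    return d


if __name__ == "__main__":
    W = np.eye(2)      # two actions, two state features
    s = np.zeros(2)    # clean observation
    d = director_stub(n_actions=2, target_action=0)
    s_adv = actor(W, s, d, eps=0.1)
    print("clean policy:    ", victim_policy(W, s))
    print("perturbed policy:", victim_policy(W, s_adv))
```

The point of the split in this sketch is that the director's output lives in the space of action distributions, whose size depends on the number of actions, while only the actor touches the (possibly much larger) state space.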