Poisoning attacks on Reinforcement Learning (RL) systems can exploit vulnerabilities of RL algorithms and cause learning to fail. However, prior works on poisoning RL usually either unrealistically assume that the attacker knows the underlying Markov Decision Process (MDP), or directly apply poisoning methods from supervised learning to RL. In this work, we build a generic poisoning framework for online RL via a comprehensive investigation of heterogeneous poisoning models in RL. Without any prior knowledge of the MDP, we propose a strategic poisoning algorithm called Vulnerability-Aware Adversarial Critic Poison (VA2C-P), which works for most policy-based deep RL agents, closing the gap that no poisoning method previously existed for policy-based RL agents. VA2C-P uses a novel metric, the stability radius in RL, which measures the vulnerability of RL algorithms. Experiments on multiple deep RL agents and multiple environments show that our poisoning algorithm successfully prevents agents from learning a good policy, or teaches the agents to converge to a target policy, with a limited attacking budget.
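To make the online poisoning threat model concrete, the sketch below shows one simple (non-strategic) instance of it: an attacker that perturbs training rewards under a fixed per-episode budget and without any knowledge of the underlying MDP. The wrapper class, the environment name, and the parameters `epsilon` and `budget` are illustrative assumptions for this sketch only; this is not the VA2C-P algorithm described in the paper.

```python
# Hypothetical illustration of a budget-limited online reward-poisoning attacker.
# NOT the paper's VA2C-P algorithm; it only shows the shape of the threat model:
# the attacker perturbs training signals under a fixed budget, with no MDP knowledge.
import gymnasium as gym
import numpy as np


class RewardPoisoningWrapper(gym.Wrapper):
    """Perturbs at most `budget` rewards per episode, each by at most `epsilon`."""

    def __init__(self, env, epsilon=1.0, budget=10):
        super().__init__(env)
        self.epsilon = epsilon  # maximum per-step reward perturbation
        self.budget = budget    # maximum number of poisoned steps per episode
        self._used = 0          # perturbations spent in the current episode

    def reset(self, **kwargs):
        self._used = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if self._used < self.budget:
            # Naive attacker: push the reward toward its negation, clipped to the
            # allowed perturbation size. A strategic attacker (as in the paper)
            # would instead choose when and how to perturb based on the learner.
            delta = np.clip(-2.0 * reward, -self.epsilon, self.epsilon)
            reward = reward + delta
            self._used += 1
        return obs, reward, terminated, truncated, info


if __name__ == "__main__":
    env = RewardPoisoningWrapper(gym.make("CartPole-v1"), epsilon=1.0, budget=20)
    obs, _ = env.reset(seed=0)
    for _ in range(50):
        obs, r, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            obs, _ = env.reset()
```

Any off-the-shelf policy-based learner trained on the wrapped environment would receive the poisoned rewards, which is the setting the framework studies; the strategic choice of which steps to poison is what distinguishes VA2C-P from this naive baseline.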