In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given dataset. We study reward-poisoning attacks in this setting, where an exogenous attacker modifies the rewards in the dataset before the agents see it. The attacker aims to guide each agent into a nefarious target policy while minimizing the $L^p$ norm of the reward modification. Unlike attacks on single-agent RL, we show that the attacker can install the target policy as a Markov Perfect Dominant Strategy Equilibrium (MPDSE), which rational agents are guaranteed to follow. This attack can be significantly cheaper than separate single-agent attacks. We show that the attack works against a variety of MARL agents, including uncertainty-aware learners, and we exhibit linear programs that solve the attack problem efficiently. We also study the relationship between the structure of the dataset and the minimal attack cost. Our work paves the way for studying defenses in offline MARL.
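As a rough illustration of the optimization referred to above, the attack can be written schematically as a constrained problem. The notation here is illustrative rather than taken from the paper: $r$ denotes the original dataset rewards, $\tilde r$ the poisoned rewards, $\pi^\dagger$ the target policy, $\iota > 0$ a dominance margin, and $\hat Q_i^{\tilde r}$ agent $i$'s $Q$-value estimated from the poisoned dataset. Assuming these estimates are linear in the dataset rewards, the constraints are linear in $\tilde r$, which is what makes a linear-programming solution possible for norms such as $p \in \{1, \infty\}$:

$$
\begin{aligned}
\min_{\tilde r}\quad & \|\tilde r - r\|_p \\
\text{s.t.}\quad & \hat Q_i^{\tilde r}\big(s, (\pi^\dagger_i(s), a_{-i})\big) \;\ge\; \hat Q_i^{\tilde r}\big(s, (a_i, a_{-i})\big) + \iota \\
& \text{for every agent } i,\ \text{state } s,\ \text{action } a_i \neq \pi^\dagger_i(s),\ \text{and opponent profile } a_{-i}.
\end{aligned}
$$

Under such constraints, $\pi^\dagger_i(s)$ strictly dominates every alternative action for every agent at every state regardless of what the other agents do, which is the sense in which the target policy becomes an MPDSE.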