Reinforcement learning algorithms, like other machine learning algorithms, face serious threats from adversaries: an adversary can manipulate the learning process so that the algorithm converges to a non-optimal policy. In this paper, we analyze multi-task federated reinforcement learning algorithms, in which multiple collaborative agents operating in different environments seek to maximize the sum of discounted returns, in the presence of adversarial agents. We argue that common attack methods are not guaranteed to succeed against multi-task federated reinforcement learning, and we propose an adaptive attack method with better attack performance. Furthermore, we modify the conventional federated reinforcement learning algorithm to defend against adversaries in a way that performs equally well with and without adversaries present. Experiments on a range of small-to-mid-size reinforcement learning problems show that the proposed attack method outperforms general attack methods, and that the proposed modification to the federated reinforcement learning algorithm achieves near-optimal policies in the presence of adversarial agents.
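To make the threat concrete, the following is a minimal sketch (not the paper's algorithm) of why adversarial agents matter in federated aggregation: when a server naively averages policy-parameter updates from all agents, a single adversary can pull the aggregate arbitrarily far from the honest consensus, whereas a simple robust aggregator such as a coordinate-wise median is far less affected. All numbers here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four honest agents report policy-parameter updates near the value 1.0
# (standing in for the true update direction).
honest_updates = 1.0 + 0.05 * rng.standard_normal((4, 3))

# One adversarial agent reports a large poisoned update.
adversarial_update = np.full((1, 3), 100.0)

updates = np.vstack([honest_updates, adversarial_update])

mean_agg = updates.mean(axis=0)          # plain federated averaging
median_agg = np.median(updates, axis=0)  # a simple robust alternative

print("mean  :", mean_agg)    # dragged far from 1.0 by the adversary
print("median:", median_agg)  # stays near the honest consensus
```

The coordinate-wise median is only one illustrative defense; the paper's modified federated reinforcement learning algorithm addresses the same failure mode in the multi-task setting.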