The growing literature on Federated Learning (FL) has recently inspired Federated Reinforcement Learning (FRL), which encourages multiple agents to federatively build a better decision-making policy without sharing raw trajectories. Despite its promising applications, existing works on FRL fail to I) provide a theoretical analysis of its convergence and II) account for random system failures and adversarial attacks. Towards this end, we propose the first FRL framework whose convergence is guaranteed and which is tolerant to fewer than half of the participating agents suffering random system failures or acting as adversarial attackers. We prove that the sample efficiency of the proposed framework is guaranteed to improve with the number of agents and that it can account for such potential failures or attacks. All theoretical results are empirically verified on various RL benchmark tasks.
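
To make the fault-tolerance claim concrete, below is a minimal illustrative sketch, not the paper's actual algorithm: a server aggregating per-agent policy gradients with a coordinate-wise median, a standard Byzantine-tolerant rule whose breakdown point matches the "fewer than half" condition stated above. The abstract does not specify the framework's aggregation rule, so the median here is an assumed stand-in, and all names (robust_aggregate, the toy gradient shapes) are hypothetical.

    # Illustrative sketch only (assumed aggregation rule, not the paper's method).
    # A server receives one flattened policy gradient per agent; as long as a
    # strict majority of agents is honest, the coordinate-wise median stays
    # close to the honest gradient even under adversarial reports.
    import numpy as np

    def robust_aggregate(agent_gradients: np.ndarray) -> np.ndarray:
        """Aggregate (K, d) per-agent gradients, tolerating < K/2 bad agents."""
        return np.median(agent_gradients, axis=0)

    # Toy usage: 5 agents, 2 of them adversarial (sending scaled, inverted gradients).
    rng = np.random.default_rng(0)
    true_grad = rng.normal(size=8)
    honest = true_grad + 0.1 * rng.normal(size=(3, 8))   # noisy honest reports
    byzantine = -10.0 * true_grad * np.ones((2, 8))      # adversarial reports
    reports = np.vstack([honest, byzantine])

    agg = robust_aggregate(reports)
    # The median remains near the honest gradient; a naive mean is dragged away.
    print("robust aggregate error:", np.linalg.norm(agg - true_grad))
    print("naive mean error:      ", np.linalg.norm(reports.mean(axis=0) - true_grad))

In this toy run the two adversaries pull the naive mean far off the honest update direction, while the median-based aggregate is essentially unaffected, which is the qualitative behavior the abstract's tolerance guarantee describes.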