An often-neglected issue in multi-agent reinforcement learning (MARL) is the potential presence of unreliable agents in the environment, whose deviations from expected behavior can prevent a system from accomplishing its intended tasks. Consensus, in particular, is a fundamental problem underpinning cooperative distributed multi-agent systems. It requires agents situated in a decentralized communication network to reach an agreement from the set of initial proposals that they put forward. Learning-based agents should adopt a protocol that allows them to reach consensus despite the presence of one or more unreliable agents in the system. This paper investigates the problem of unreliable agents in MARL, considering consensus as a case study. Echoing established results in the distributed systems literature, our experiments show that even a moderate fraction of such agents can greatly impair the ability to reach consensus in a networked environment. We propose Reinforcement Learning-based Trusted Consensus (RLTC), a decentralized trust mechanism in which each agent independently decides which neighbors to communicate with. We empirically demonstrate that our trust mechanism deals with unreliable agents effectively, as evidenced by higher consensus success rates.
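To make the setting concrete, the following is a minimal toy sketch in Python; it is not the RLTC method. It assumes binary proposals, a fully connected network, synchronous majority voting with ties resolved to 0, and a single unreliable agent that tells every receiver the opposite of the receiver's current value. The function name `run_consensus` and the hand-written trust sets are hypothetical; in RLTC each agent would learn which neighbors to communicate with rather than being given that set.

```python
from typing import Dict, List, Optional, Set


def run_consensus(
    proposals: List[int],
    unreliable: Set[int],
    trusted: Dict[int, Set[int]],
    max_rounds: int = 10,
) -> Optional[int]:
    """Toy synchronous majority vote over binary values.

    Returns the value the reliable agents agree on, or None if they
    have not agreed within max_rounds. Illustrative assumption only:
    this is not the protocol or the trust mechanism from the paper.
    """
    values = list(proposals)
    reliable = [i for i in range(len(values)) if i not in unreliable]

    def agreed() -> bool:
        return len({values[i] for i in reliable}) == 1

    for _ in range(max_rounds):
        if agreed():
            return values[reliable[0]]
        new_values = list(values)
        for i in reliable:
            received = [values[i]]  # an agent always counts its own value
            for j in sorted(trusted[i]):
                if j in unreliable:
                    # Assumed misbehavior: contradict the receiver.
                    received.append(1 - values[i])
                else:
                    received.append(values[j])
            ones = sum(received)
            zeros = len(received) - ones
            new_values[i] = 1 if ones > zeros else 0  # ties go to 0
        values = new_values
    return values[reliable[0]] if agreed() else None


# Agents 0-3 are reliable; agent 4 is unreliable (its own proposal is unused).
proposals = [1, 1, 0, 0, 0]
unreliable = {4}
trust_all = {i: set(range(5)) - {i} for i in range(5)}
trust_filtered = {i: trust_all[i] - unreliable for i in range(5)}

print(run_consensus(proposals, unreliable, trust_all))       # None
print(run_consensus(proposals, unreliable, trust_filtered))  # 0
```

In this toy run, listening to every neighbor lets the single unreliable agent tip each vote, so the reliable agents swap values every round and never agree. Once each reliable agent ignores agent 4, they agree after one round. The example only illustrates why choosing whom to listen to matters; it says nothing about how RLTC learns that choice.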