Fairness is essential for human society, contributing to stability and productivity. Similarly, fairness is key to many multi-agent systems. Incorporating fairness into multi-agent learning could help multi-agent systems become both efficient and stable. However, learning efficiency and fairness simultaneously is a complex, multi-objective, joint-policy optimization problem. To tackle these difficulties, we propose FEN, a novel hierarchical reinforcement learning model. We first decompose fairness for each agent and propose a fair-efficient reward that each agent learns its own policy to optimize. To avoid multi-objective conflict, we design a hierarchy consisting of a controller and several sub-policies, where the controller maximizes the fair-efficient reward by switching among the sub-policies, which provide diverse behaviors for interacting with the environment. FEN can be trained in a fully decentralized way, making it easy to deploy in real-world applications. Empirically, we show that FEN easily learns both fairness and efficiency and significantly outperforms baselines in a variety of multi-agent scenarios.
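The controller and sub-policy hierarchy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the form of `fair_efficient_reward`, the epsilon-greedy `Controller` (a stand-in for a learned controller), the fixed switching interval, and the toy sub-policies that simply return a per-step gain.

```python
import random


def fair_efficient_reward(own_utility, mean_utility, eps=0.1):
    """Hypothetical per-agent reward (assumed form, not from the abstract).

    It grows with the mean utility (efficiency) and shrinks as the agent's
    own utility deviates from the mean (fairness).
    """
    if mean_utility <= 0:
        return 0.0
    deviation = abs(own_utility / mean_utility - 1.0)
    return mean_utility / (eps + deviation)


class Controller:
    """Epsilon-greedy value estimates over sub-policy indices.

    A simple stand-in for a learned controller; one instance per agent,
    so each agent can be trained without a central coordinator.
    """

    def __init__(self, n_sub_policies, epsilon=0.1, lr=0.1, rng=None):
        self.values = [0.0] * n_sub_policies
        self.epsilon = epsilon
        self.lr = lr
        self.rng = rng or random.Random(0)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, index, reward):
        # Move the estimate for the chosen sub-policy toward the reward.
        self.values[index] += self.lr * (reward - self.values[index])


def run_episode(controller, sub_policies, mean_utility, steps=100, switch_every=10):
    """Run one agent; the controller re-selects a sub-policy every `switch_every` steps.

    `sub_policies` are hypothetical callables returning a per-step gain.
    `mean_utility` is assumed to be supplied from outside (for example,
    estimated from other agents); here it is a fixed number.
    """
    total_gain, utility = 0.0, 0.0
    choice = controller.select()
    for t in range(steps):
        if t > 0 and t % switch_every == 0:
            # Credit the sub-policy that just ran with the fair-efficient reward.
            controller.update(choice, fair_efficient_reward(utility, mean_utility))
            choice = controller.select()
        total_gain += sub_policies[choice]()
        utility = total_gain / (t + 1)  # running average of the agent's gain
    return utility
```

In this toy setup, one sub-policy could return a gain of 1.0 per step (take the resource) and another 0.0 (yield it), for example `run_episode(Controller(2), [lambda: 1.0, lambda: 0.0], mean_utility=0.5)`. The only point is the control flow: the controller is rewarded with the fair-efficient signal, while the sub-policies supply the behaviors it switches among.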