Inspired by applications such as supply chain management, epidemics, and social networks, we formulate a stochastic game model that addresses three key features common across these domains: 1) network-structured player interactions, 2) pairwise mixed cooperation and competition among players, and 3) limited global information for individual decision-making. In combination, these features pose significant challenges for the black-box approaches taken by deep learning-based multi-agent reinforcement learning (MARL) algorithms and deserve more detailed analysis. Specifically, we formulate a networked stochastic game with pairwise general-sum objectives and an asymmetric information structure, and empirically explore the effects of information availability on the outcomes of different MARL paradigms, such as individual learning and centralized learning with decentralized execution.