Recent years have witnessed the great success of multi-agent systems (MAS). Value decomposition, which decomposes joint action values into individual action values, has been an important line of work in MAS. However, many value decomposition methods ignore the coordination among different agents, leading to the notorious "lazy agents" problem. To enhance coordination in MAS, this paper proposes HyperGraph CoNvolution MIX (HGCN-MIX), a method that combines hypergraph convolution with value decomposition. HGCN-MIX models agents and their relationships as a hypergraph, in which agents are nodes and a hyperedge connecting several nodes indicates that the corresponding agents can coordinate to obtain larger rewards. It then learns a hypergraph that captures the collaborative relationships among agents. By leveraging the learned hypergraph to account for how other agents' observations and actions affect their decisions, the agents in a MAS can coordinate more effectively. We evaluate HGCN-MIX on the StarCraft II Multi-Agent Challenge benchmark. The experimental results demonstrate that HGCN-MIX trains joint policies that outperform or match the current state-of-the-art techniques. We also observe that the performance improvement of HGCN-MIX is even more pronounced in scenarios with a large number of agents. In addition, further analysis shows that as the hypergraph captures more relationships, HGCN-MIX trains stronger joint policies.
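To make the core operation concrete, the following is a minimal sketch of a standard hypergraph-convolution layer (in the spectral form of Feng et al., 2019) applied to per-agent utilities. It is not the authors' full HGCN-MIX mixing network: the incidence matrix `H`, hyperedge weights `W_e`, and feature transform `Theta` here are fixed illustrative placeholders, whereas HGCN-MIX learns the hypergraph structure during training.

```python
import numpy as np

def hypergraph_conv(X, H, W_e=None, Theta=None):
    """One hypergraph-convolution layer:
    X' = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Theta
    X: (N, F) node features (e.g. per-agent utilities)
    H: (N, E) incidence matrix (node i belongs to hyperedge j iff H[i, j] = 1)."""
    N, E = H.shape
    if W_e is None:
        W_e = np.ones(E)                      # uniform hyperedge weights
    if Theta is None:
        Theta = np.eye(X.shape[1])            # identity feature transform
    D_v = (H * W_e).sum(axis=1)               # weighted node degrees
    D_e = H.sum(axis=0)                       # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(D_v, 1e-8)))
    De_inv = np.diag(1.0 / np.maximum(D_e, 1e-8))
    A = Dv_inv_sqrt @ H @ np.diag(W_e) @ De_inv @ H.T @ Dv_inv_sqrt
    return A @ X @ Theta

# Toy example: 4 agents, 2 hyperedges grouping agents {0, 1, 2} and {2, 3}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)
q_values = np.array([[0.2], [0.5], [0.1], [0.9]])   # per-agent utilities
print(hypergraph_conv(q_values, H))
```

Each agent's output utility becomes a weighted aggregate of the utilities of agents sharing a hyperedge with it, which is the mechanism by which coordination information propagates before the mixing step.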