Finding a balance between collaboration and competition is crucial for artificial agents in many real-world applications. We investigate this balance in a Multi-Agent Reinforcement Learning (MARL) setup built around a high-impact problem. The accumulation and yearly growth of plastic in the ocean cause irreparable damage to many aspects of oceanic health and the marine system. To prevent further damage, we need to find ways to remove macroplastics from known plastic patches in the ocean. Here we propose a Graph Neural Network (GNN) based communication mechanism that extends the agents' observation space. In our custom environment, each agent controls a plastic-collecting vessel. The communication mechanism enables agents to develop a communication protocol using a binary signal. While the goal of the agent collective is to clean up as much plastic as possible, each agent is rewarded for the amount of macroplastics it collects individually. Hence, agents have to learn to communicate effectively while maintaining high individual performance. We compare our proposed communication mechanism with a multi-agent baseline that cannot communicate. Results show that communication enables collaboration and significantly increases collective performance. This indicates that agents have learned the value of communication and found a balance between collaboration and competition.
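The communication mechanism described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the GNN architecture, so a fixed mean over neighbours' binary signals stands in for a learned message-passing layer, and all names (`aggregate_signals`, `extend_observations`, `individual_rewards`, `collective_performance`) as well as the adjacency-matrix representation are assumptions made for illustration.

```python
from typing import List


def aggregate_signals(signals: List[int], adjacency: List[List[int]]) -> List[float]:
    """Mean of the binary signals sent by each agent's neighbours.

    Assumption: a plain mean replaces the learned GNN aggregation.
    An agent with no neighbours receives 0.0.
    """
    n = len(signals)
    incoming = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and adjacency[i][j]]
        if neighbours:
            incoming.append(sum(signals[j] for j in neighbours) / len(neighbours))
        else:
            incoming.append(0.0)
    return incoming


def extend_observations(
    observations: List[List[float]],
    signals: List[int],
    adjacency: List[List[int]],
) -> List[List[float]]:
    """Append the aggregated incoming signal to each agent's local observation."""
    incoming = aggregate_signals(signals, adjacency)
    return [list(obs) + [msg] for obs, msg in zip(observations, incoming)]


def individual_rewards(collected: List[float]) -> List[float]:
    """Each agent is rewarded for the plastic it collected itself."""
    return [float(c) for c in collected]


def collective_performance(collected: List[float]) -> float:
    """The collective objective: total plastic removed by all agents."""
    return float(sum(collected))


# Three vessels on a line graph: 0 - 1 - 2 (hypothetical topology).
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
observations = [[0.2, 0.5], [0.1, 0.9], [0.7, 0.3]]  # made-up local features
signals = [1, 0, 1]                                   # one binary signal per agent

extended = extend_observations(observations, signals, adjacency)
rewards = individual_rewards([3, 0, 2])
total = collective_performance([3, 0, 2])
```

The sketch separates the two quantities the abstract contrasts: `individual_rewards` is what each agent optimises, while `collective_performance` is what the comparison against the non-communicating baseline measures. In a learned version, both the emitted signal and the aggregation would be trained rather than fixed.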