Learning to Communicate and Collaborate in a Competitive Multi-Agent Setup to Clean the Ocean from Macroplastics - Zhuanzhi Papers


April 12, 2023


Philipp Dominic Siedler

From arXiv; Tackling Climate Change with Machine Learning Workshop at the 11th International Conference on Learning Representations (ICLR 2023)

Finding a balance between collaboration and competition is crucial for artificial agents in many real-world applications. We investigate this using a Multi-Agent Reinforcement Learning (MARL) setup applied to a high-impact problem. The accumulation and yearly growth of plastic in the ocean cause irreparable damage to many aspects of oceanic health and the marine ecosystem. To prevent further damage, we need to find ways to remove macroplastics from known plastic patches in the ocean. Here we propose a Graph Neural Network (GNN) based communication mechanism that increases the agents' observation space. In our custom environment, each agent controls a plastic-collecting vessel. The communication mechanism enables agents to develop a communication protocol using a binary signal. While the goal of the agent collective is to clean up as much as possible, agents are rewarded for the individual amount of macroplastics collected. Hence, agents have to learn to communicate effectively while maintaining high individual performance. We compare our proposed communication mechanism with a multi-agent baseline without the ability to communicate. Results show that communication enables collaboration and significantly increases collective performance. This means agents have learned the importance of communication and found a balance between collaboration and competition.

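The abstract says that each agent emits a binary signal and that a GNN-based mechanism enlarges each agent's observation space. The abstract does not specify the architecture, so the following is only a minimal sketch of the general idea, assuming one round of mean-aggregation message passing over a fixed communication graph. The function name `augment_observations`, the graph layout, and the choice of mean aggregation are all illustrative assumptions, not the author's implementation.

```python
from typing import Dict, List


def augment_observations(
    observations: List[List[float]],
    signals: List[int],
    neighbors: Dict[int, List[int]],
) -> List[List[float]]:
    """Append the mean binary signal of each agent's neighbors to its observation.

    Hypothetical helper: one message-passing round with mean aggregation,
    standing in for the GNN-based mechanism named in the abstract.
    """
    augmented = []
    for agent, obs in enumerate(observations):
        nbrs = neighbors.get(agent, [])
        # Mean aggregation over incoming binary signals (assumed choice).
        incoming = sum(signals[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        # The observation grows by one entry: the aggregated message.
        augmented.append(list(obs) + [incoming])
    return augmented


# Three vessels on a fully connected communication graph (illustrative values).
observations = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
signals = [1, 0, 1]  # one binary signal per agent
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

augmented = augment_observations(observations, signals, neighbors)
print(augmented)
```

In this sketch every observation vector gains exactly one extra entry. A learned policy could condition on that entry, which is one simple way communication can widen what each agent observes while rewards remain individual.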
