November 27, 2021

Normative Disagreement as a Challenge for Cooperative AI


Julian Stastny, Maxime Riché, Alexander Lyzhov, Johannes Treutlein, Allan Dafoe, Jesse Clifton

From arXiv. Accepted at the Cooperative AI workshop and the Strategic ML workshop at NeurIPS 2021.

Cooperation in settings where agents have both common and conflicting interests (mixed-motive environments) has recently received considerable attention in multi-agent learning. However, the mixed-motive environments typically studied have a single cooperative outcome on which all agents can agree. Many real-world multi-agent environments are instead bargaining problems (BPs): they have several Pareto-optimal payoff profiles over which agents have conflicting preferences. We argue that typical cooperation-inducing learning algorithms fail to cooperate in BPs when there is room for normative disagreement resulting in the existence of multiple competing cooperative equilibria, and illustrate this problem empirically. To remedy the issue, we introduce the notion of norm-adaptive policies. Norm-adaptive policies are capable of behaving according to different norms in different circumstances, creating opportunities for resolving normative disagreement. We develop a class of norm-adaptive policies and show in experiments that these significantly increase cooperation. However, norm-adaptiveness cannot address residual bargaining failure arising from a fundamental tradeoff between exploitability and cooperative robustness.
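To make the notion of normative disagreement concrete, here is a minimal Python sketch of a toy bargaining problem. Everything in it is an assumption made for illustration and is not taken from the abstract: the four payoff profiles, the outcome labels, and the choice of a utilitarian norm (maximize the sum of payoffs) and an egalitarian norm (maximize the minimum payoff) as the two competing norms.

```python
from typing import Dict, Tuple

Payoff = Tuple[float, float]  # (payoff to agent 1, payoff to agent 2)

# Hypothetical payoff profiles, chosen only for illustration.
OUTCOMES: Dict[str, Payoff] = {
    "A": (6, 1),
    "B": (3, 3),
    "C": (1, 4),
    "D": (2, 2),
}


def pareto_optimal(outcomes: Dict[str, Payoff]) -> Dict[str, Payoff]:
    """Keep the outcomes that no other outcome weakly dominates."""
    def dominated(p: Payoff) -> bool:
        # q dominates p if it is at least as good for both agents and differs from p.
        return any(
            q != p and q[0] >= p[0] and q[1] >= p[1]
            for q in outcomes.values()
        )
    return {name: p for name, p in outcomes.items() if not dominated(p)}


def utilitarian(outcomes: Dict[str, Payoff]) -> str:
    """Assumed norm 1: pick the outcome with the largest total payoff."""
    return max(outcomes, key=lambda name: sum(outcomes[name]))


def egalitarian(outcomes: Dict[str, Payoff]) -> str:
    """Assumed norm 2: pick the outcome with the largest minimum payoff."""
    return max(outcomes, key=lambda name: min(outcomes[name]))


frontier = pareto_optimal(OUTCOMES)
choice_util = utilitarian(frontier)
choice_egal = egalitarian(frontier)

print("Pareto-optimal outcomes:", sorted(frontier))
print("Utilitarian choice:", choice_util)
print("Egalitarian choice:", choice_egal)
```

With these assumed numbers, the Pareto-optimal outcomes are A, B, and C. The utilitarian norm selects A and the egalitarian norm selects B. Two agents that each aim to cooperate but follow different norms would therefore target different Pareto-optimal outcomes, which is the kind of coordination failure the abstract describes.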

