Each year, expert-level performance is attained in increasingly complex multiagent domains, with notable examples including Go, Poker, and StarCraft II. This rapid progression is accompanied by a commensurate need to better understand how such agents attain this performance, in order to enable their safe deployment, identify limitations, and reveal potential means of improving them. In this paper we take a step back from performance-focused multiagent learning and instead turn our attention to agent behavior analysis. We introduce a model-agnostic method for discovering behavior clusters in multiagent domains, using variational inference to learn a hierarchy of behaviors at the joint and local agent levels. Our framework makes no assumptions about agents' underlying learning algorithms, does not require access to their latent states or policies, and is trained using only offline observational data. We illustrate the effectiveness of our method for enabling a coupled understanding of behaviors at the joint and local agent levels, detecting behavior changepoints throughout training, and discovering core behavioral concepts; we further demonstrate the approach's scalability to a high-dimensional multiagent MuJoCo control domain, and show that it can disentangle previously-trained policies in OpenAI's hide-and-seek domain.