Multi-agent reinforcement learning (MARL) has recently achieved tremendous success in a wide range of fields. However, with black-box neural network architectures, existing MARL methods make decisions in an opaque fashion that hinders humans from understanding the learned knowledge and how input observations influence decisions. Our solution is MIXing Recurrent soft decision Trees (MIXRTs), a novel interpretable architecture that represents explicit decision processes via the root-to-leaf paths of decision trees. We introduce a novel recurrent structure into soft decision trees to address partial observability, and estimate the joint action value by linearly mixing the outputs of recurrent trees based only on local observations. Theoretical analysis shows that MIXRTs satisfies the structural constraints of additivity and monotonicity in factorization. We evaluate MIXRTs on a range of challenging StarCraft II tasks. Experimental results show that our interpretable learning framework achieves performance competitive with widely investigated baselines, while delivering more straightforward explanations and domain knowledge of the decision process.
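To make the two core ingredients concrete, the following is a minimal NumPy sketch (not the authors' implementation; all weights are random placeholders rather than learned parameters, and the recurrent component is omitted): each inner node of a soft decision tree applies a sigmoid gate, the tree output is a probability-weighted sum over root-to-leaf paths, and local values are combined by a linear mixer whose absolute-valued weights enforce the monotonicity constraint.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_decision_tree(x, depth=2):
    """Route an observation x through a soft decision tree of the given depth.

    Each inner node computes a sigmoid gating probability; the output is the
    sum of leaf values weighted by the probability of each root-to-leaf path.
    Weights here are random stand-ins for learned parameters (illustrative only).
    """
    n_inner = 2 ** depth - 1                     # number of gating (inner) nodes
    n_leaf = 2 ** depth                          # number of leaves
    W = rng.normal(size=(n_inner, x.size))       # hypothetical gate weights
    b = rng.normal(size=n_inner)
    leaf_values = rng.normal(size=n_leaf)        # hypothetical per-leaf value estimates

    gates = 1.0 / (1.0 + np.exp(-(W @ x + b)))   # sigmoid gating probabilities

    # Probability of reaching each leaf = product of gate probabilities on its path.
    path_probs = np.ones(n_leaf)
    for leaf in range(n_leaf):
        node = 0
        for d in range(depth):
            go_right = (leaf >> (depth - 1 - d)) & 1
            p = gates[node]
            path_probs[leaf] *= p if go_right else (1.0 - p)
            node = 2 * node + 1 + go_right       # descend to left/right child
    return path_probs @ leaf_values              # expected value over all paths

def mix_joint_q(local_qs, mix_w):
    """Linearly mix per-agent values with non-negative (|w|) weights, so that
    dQ_tot / dQ_i >= 0 -- the additivity/monotonicity constraint."""
    return np.abs(mix_w) @ local_qs

# Usage: three agents, each evaluating a 4-dimensional local observation.
q_locals = np.array([soft_decision_tree(rng.normal(size=4)) for _ in range(3)])
q_tot = mix_joint_q(q_locals, rng.normal(size=3))
```

Because the mixing weights enter only through their absolute value, raising any agent's local value can never lower the joint value, which is the monotonicity property the abstract refers to; interpretability comes from reading off the sigmoid gate activations along each root-to-leaf path.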