Explainable Action Advising for Multi-Agent Reinforcement Learning - Zhuanzhi Papers


Agents · Multi-agent · Multi-agent reinforcement learning · Reinforcement learning · Convergence rate

April 3, 2023

Explainable Action Advising for Multi-Agent Reinforcement Learning


Yue Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara

From arXiv. This work has been accepted to ICRA 2023.

Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student's sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs. However, advice in this form is difficult for the student to reason about and to apply to novel states. We introduce Explainable Action Advising, in which the teacher provides action advice together with an explanation of why the action was chosen. This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance, even in environments where the teacher is sub-optimal. We empirically show that our framework is effective in both single-agent and multi-agent scenarios, yielding improved policy returns and convergence rates when compared to state-of-the-art methods.
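The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say what form an explanation takes, so the code below assumes, purely for illustration, that an explanation is a list of range conditions over named state features. The `Teacher`, `Student`, `Advice`, and `explanation_holds` names, the `distance` feature, and the advice budget are all hypothetical. The sketch shows how a student that stores the explanation alongside the advised action might reuse that advice in a state it has never asked about, which a bare state-action pair would not allow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, float]
# Hypothetical explanation format: (feature name, lower bound, upper bound).
Condition = Tuple[str, float, float]


@dataclass
class Advice:
    action: str
    explanation: List[Condition]  # why the teacher chose this action


def explanation_holds(explanation: List[Condition], state: State) -> bool:
    """Return True if every condition of the explanation is met by the state."""
    return all(low <= state[feature] < high for feature, low, high in explanation)


class Teacher:
    """Toy hand-written teacher; a real teacher would be a trained policy."""

    def advise(self, state: State) -> Advice:
        if state["distance"] < 1.0:
            return Advice("brake", [("distance", float("-inf"), 1.0)])
        return Advice("accelerate", [("distance", 1.0, float("inf"))])


class Student:
    def __init__(self, teacher: Teacher, budget: int) -> None:
        self.teacher = teacher
        self.budget = budget            # number of times the teacher may be asked
        self.memory: List[Advice] = []  # advice received so far, with explanations

    def recall(self, state: State) -> Optional[Advice]:
        """Look for stored advice whose explanation also applies to this state."""
        for advice in self.memory:
            if explanation_holds(advice.explanation, state):
                return advice
        return None

    def act(self, state: State, own_policy: Callable[[State], str]) -> str:
        remembered = self.recall(state)
        if remembered is not None:
            return remembered.action    # generalize earlier advice to a new state
        if self.budget > 0:
            self.budget -= 1
            advice = self.teacher.advise(state)
            self.memory.append(advice)
            return advice.action
        return own_policy(state)        # no advice available: act on its own


student = Student(Teacher(), budget=1)
first = student.act({"distance": 0.5}, lambda s: "noop")   # asks the teacher
second = student.act({"distance": 0.2}, lambda s: "noop")  # reuses stored advice
third = student.act({"distance": 5.0}, lambda s: "noop")   # budget spent, no match
```

With a budget of one, the student asks the teacher only for the first state, reuses the stored explanation for the second state, and falls back to its own policy for the third.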


