高级模拟学习在线鲍姆-韦尔奇算法 (Online Baum-Welch algorithm for Hierarchical Imitation Learning) - 专知论文

会员服务 ·

0

Baum-Welch算法 · 学成 · Continuity · 分层强化学习 · Performer ·

2021 年 3 月 22 日

Online Baum-Welch algorithm for Hierarchical Imitation Learning

翻译：高级模拟学习在线鲍姆-韦尔奇算法

Vittorio Giammarino,Ioannis Ch. Paschalidis

from arxiv, 15 pages, 2 figures

The options framework for hierarchical reinforcement learning has increased its popularity in recent years and has made improvements in tackling the scalability problem in reinforcement learning. Yet, most of these recent successes are linked with a proper options initialization or discovery. When an expert is available, the options discovery problem can be addressed by learning an options-type hierarchical policy directly from expert demonstrations. This problem is referred to as hierarchical imitation learning and can be handled as an inference problem in a Hidden Markov Model, which is done via an Expectation-Maximization type algorithm. In this work, we propose a novel online algorithm to perform hierarchical imitation learning in the options framework. Further, we discuss the benefits of such an algorithm and compare it with its batch version in classical reinforcement learning benchmarks. We show that this approach works well in both discrete and continuous environments and, under certain conditions, it outperforms the batch version.

翻译：强化等级学习的选项框架近年来越来越受欢迎,并在解决强化等级学习的可扩缩性问题方面有所改进。然而,最近这些成功大多与适当的选项初始化或发现相关。当专家具备时,发现选项的问题可以通过直接从专家演示中学习选项类型等级政策来解决。这个问题被称为等级仿造学习,并可在隐藏的Markov模型中作为一个推论问题处理,该模型是通过期望-最大化类型算法完成的。在这项工作中,我们提出一种新的在线算法,以便在选项框架中进行等级仿制学习。此外,我们讨论这种算法的好处,并将其与经典强化学习基准中的批次版本进行比较。我们表明,这种方法在离散和连续的环境中运作良好,在某些条件下,它比分批版本更完善。

0

相关内容

Baum-Welch算法

Baum-Welch算法

【2020 最新论文】节点邻近的图池化的层次表示学习 Graph Pooling with Node Proximity for Hierarchical Representation Learning

【2020 最新论文】节点邻近的图池化的层次表示学习 Graph Pooling with Node Proximity for Hierarchical Representation Learning

专知会员服务

43+阅读 · 2020年7月19日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

论深度学习的信息瓶颈理论（On the information bottleneck theory of deep learning）

论深度学习的信息瓶颈理论（On the information bottleneck theory of deep learning）

专知会员服务

66+阅读 · 2019年12月20日

【NeurIPS2019】模仿学习中的因果混乱问题 Causal Confusion in Imitation Learning

【NeurIPS2019】模仿学习中的因果混乱问题 Causal Confusion in Imitation Learning

专知会员服务

30+阅读 · 2019年12月10日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Efficient PAC Reinforcement Learning in Regular Decision Processes

Arxiv

0+阅读 · 2021年5月14日

Ordering-Based Causal Discovery with Reinforcement Learning

Arxiv

0+阅读 · 2021年5月14日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Arxiv

8+阅读 · 2020年4月13日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Arxiv

5+阅读 · 2019年6月18日

Hierarchical Meta Learning

Arxiv

9+阅读 · 2019年4月19日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Hierarchical Deep Multiagent Reinforcement Learning

Hierarchical Deep Multiagent Reinforcement Learning

Arxiv

8+阅读 · 2018年9月25日

Video Captioning via Hierarchical Reinforcement Learning

Arxiv

20+阅读 · 2018年3月29日

VIP会员

文章信息

相关主题

Baum-Welch算法

分层强化学习

相关VIP内容

【2020 最新论文】节点邻近的图池化的层次表示学习 Graph Pooling with Node Proximity for Hierarchical Representation Learning

【2020 最新论文】节点邻近的图池化的层次表示学习 Graph Pooling with Node Proximity for Hierarchical Representation Learning

专知会员服务

43+阅读 · 2020年7月19日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

论深度学习的信息瓶颈理论（On the information bottleneck theory of deep learning）

论深度学习的信息瓶颈理论（On the information bottleneck theory of deep learning）

专知会员服务

66+阅读 · 2019年12月20日

【NeurIPS2019】模仿学习中的因果混乱问题 Causal Confusion in Imitation Learning

【NeurIPS2019】模仿学习中的因果混乱问题 Causal Confusion in Imitation Learning

专知会员服务

30+阅读 · 2019年12月10日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Efficient PAC Reinforcement Learning in Regular Decision Processes

Arxiv

0+阅读 · 2021年5月14日

Ordering-Based Causal Discovery with Reinforcement Learning

Arxiv

0+阅读 · 2021年5月14日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Arxiv

8+阅读 · 2020年4月13日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Arxiv

5+阅读 · 2019年6月18日

Hierarchical Meta Learning

Arxiv

9+阅读 · 2019年4月19日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Hierarchical Deep Multiagent Reinforcement Learning

Hierarchical Deep Multiagent Reinforcement Learning

Arxiv

8+阅读 · 2018年9月25日

Video Captioning via Hierarchical Reinforcement Learning

Arxiv

20+阅读 · 2018年3月29日

微信扫码咨询专知VIP会员