POMDPs 的强力非对称学习 (Robust Asymmetric Learning in POMDPs) - 专知论文

会员服务 ·

0

学成 · 部分可观测马尔可夫决策过程 · 次最优 · 稳健性 · Processing（编程语言） ·

2021 年 3 月 19 日

Robust Asymmetric Learning in POMDPs

翻译：POMDPs 的强力非对称学习

Andrew Warrington,J. Wilder Lavington,Adam Scibior,Mark Schmidt,Frank Wood

Policies for partially observed Markov decision processes can be efficiently learned by imitating policies for the corresponding fully observed Markov decision processes. Unfortunately, existing approaches for this kind of imitation learning have a serious flaw: the expert does not know what the trainee cannot see, and so may encourage actions that are sub-optimal, even unsafe, under partial information. We derive an objective to instead train the expert to maximize the expected reward of the imitating agent policy, and use it to construct an efficient algorithm, adaptive asymmetric DAgger (A2D), that jointly trains the expert and the agent. We show that A2D produces an expert policy that the agent can safely imitate, in turn outperforming policies learned by imitating a fixed expert.

翻译：部分观测到的Markov决策程序的政策可以通过模仿相应的充分观察的Markov决策程序的政策来有效学习。不幸的是,目前这种模仿学习的方法存在严重的缺陷:专家不知道受训人看不到什么,因此鼓励一些行动,这些行动不够理想,甚至是不安全的,只是部分信息。我们的目标是培训专家尽量扩大模仿剂政策的预期回报,并利用它构建一种高效算法,即适应性不对称的达格(A2D),联合培训专家和代理人。我们证明A2D产生了一种专家政策,该代理人可以安全地模仿,从而产生模仿固定专家所学到的绩效政策。

0

相关内容

【开放书】贝叶斯推理与机器学习，690页pdf，Bayesian Reasoning and Machine Learning

【开放书】贝叶斯推理与机器学习，690页pdf，Bayesian Reasoning and Machine Learning

专知会员服务

192+阅读 · 2020年5月30日

史上机器学习 &深度学习课程大合集，一站搞定，Deep Learning Drizzle

史上机器学习 &深度学习课程大合集，一站搞定，Deep Learning Drizzle

专知会员服务

176+阅读 · 2020年5月10日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

专知会员服务

17+阅读 · 2020年4月10日

【CVPR 2019 | tutorial】计算机视觉的深度强化学习：Deep Reinforcement Learning for Computer Vision

【CVPR 2019 | tutorial】计算机视觉的深度强化学习：Deep Reinforcement Learning for Computer Vision

专知会员服务

56+阅读 · 2019年11月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

Cross-Modal & Metric Learning 跨模态检索专题-2

Cross-Modal & Metric Learning 跨模态检索专题-2

AINLP

5+阅读 · 2020年5月21日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Robust Graph Learning Under Wasserstein Uncertainty

Arxiv

0+阅读 · 2021年5月13日

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

0+阅读 · 2021年5月6日

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Arxiv

1+阅读 · 2021年5月4日

Learning Multi-level Weight-centric Features for Few-shot Learning

Arxiv

0+阅读 · 2021年5月4日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Hardness-Aware Deep Metric Learning

Hardness-Aware Deep Metric Learning

Arxiv

6+阅读 · 2019年3月13日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

Deep Randomized Ensembles for Metric Learning

Deep Randomized Ensembles for Metric Learning

Arxiv

5+阅读 · 2018年9月4日

Geometric Understanding of Deep Learning

Arxiv

5+阅读 · 2018年5月31日

Online Deep Metric Learning

Arxiv

8+阅读 · 2018年5月15日

VIP会员

文章信息

相关主题

部分可观测马尔可夫决策过程

Processing（编程语言）

相关VIP内容

【开放书】贝叶斯推理与机器学习，690页pdf，Bayesian Reasoning and Machine Learning

【开放书】贝叶斯推理与机器学习，690页pdf，Bayesian Reasoning and Machine Learning

专知会员服务

192+阅读 · 2020年5月30日

史上机器学习 &深度学习课程大合集，一站搞定，Deep Learning Drizzle

史上机器学习 &深度学习课程大合集，一站搞定，Deep Learning Drizzle

专知会员服务

176+阅读 · 2020年5月10日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

专知会员服务

17+阅读 · 2020年4月10日

【CVPR 2019 | tutorial】计算机视觉的深度强化学习：Deep Reinforcement Learning for Computer Vision

【CVPR 2019 | tutorial】计算机视觉的深度强化学习：Deep Reinforcement Learning for Computer Vision

专知会员服务

56+阅读 · 2019年11月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

Deep Research（深度研究）：系统性综述

《革新战术战场空间能力：反无人机系统》报告

【普林斯顿博士论文】用于语音的生成式通用模型

螺旋式开发作为战略资产：美军启示

相关资讯

Cross-Modal & Metric Learning 跨模态检索专题-2

Cross-Modal & Metric Learning 跨模态检索专题-2

AINLP

5+阅读 · 2020年5月21日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Robust Graph Learning Under Wasserstein Uncertainty

Arxiv

0+阅读 · 2021年5月13日

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

0+阅读 · 2021年5月6日

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Arxiv

1+阅读 · 2021年5月4日

Learning Multi-level Weight-centric Features for Few-shot Learning

Arxiv

0+阅读 · 2021年5月4日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Hardness-Aware Deep Metric Learning

Hardness-Aware Deep Metric Learning

Arxiv

6+阅读 · 2019年3月13日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

Deep Randomized Ensembles for Metric Learning

Deep Randomized Ensembles for Metric Learning

Arxiv

5+阅读 · 2018年9月4日

Geometric Understanding of Deep Learning

Arxiv

5+阅读 · 2018年5月31日

Online Deep Metric Learning

Arxiv

8+阅读 · 2018年5月15日

微信扫码咨询专知VIP会员