机器人学大型行动空间行动前行动 (Action Priors for Large Action Spaces in Robotics) - 专知论文

会员服务 ·

0

学成 · 机器人 · 强化学习 · 情景 · 塑造 ·

2021 年 1 月 11 日

Action Priors for Large Action Spaces in Robotics

翻译：机器人学大型行动空间行动前行动

Ondrej Biza,Dian Wang,Robert Platt,Jan-Willem van de Meent,Lawson L. S. Wong

from arxiv, 12 pages, 9 figures

In robotics, it is often not possible to learn useful policies using pure model-free reinforcement learning without significant reward shaping or curriculum learning. As a consequence, many researchers rely on expert demonstrations to guide learning. However, acquiring expert demonstrations can be expensive. This paper proposes an alternative approach where the solutions of previously solved tasks are used to produce an action prior that can facilitate exploration in future tasks. The action prior is a probability distribution over actions that summarizes the set of policies found solving previous tasks. Our results indicate that this approach can be used to solve robotic manipulation problems that would otherwise be infeasible without expert demonstrations.

翻译：在机器人方面,往往不可能在不进行重大奖励制成或课程学习的情况下,学习使用纯粹无模式强化学习的有益政策;因此,许多研究人员依靠专家示范来指导学习;然而,获得专家示范可能费用昂贵;本文件提出另一种办法,即利用以前解决过的任务的解决方案来产生行动,从而便利今后任务的探索;先前的行动是,对总结解决以往任务的一系列政策的行动进行概率分配;我们的结果显示,这种方法可用于解决机器人操纵问题,而如果没有专家示范,这些问题本来是行不通的。

0

相关内容

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【O'Reilly AI Conference 2019】AI成长之路：使用指南（For AI to thrive, failure is necessary: A practical guide (sponsored by IBM Watson)),IBM Ritika Gunnar

【O'Reilly AI Conference 2019】AI成长之路：使用指南（For AI to thrive, failure is necessary: A practical guide (sponsored by IBM Watson)),IBM Ritika Gunnar

专知会员服务

11+阅读 · 2019年11月5日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Arxiv

0+阅读 · 2021年3月8日

Rapidly-exploring Random Forest: Adaptively Exploits Local Structure with Generalised Multi-Trees Motion Planning

Arxiv

0+阅读 · 2021年3月7日

Causal Inference With Selectively Deconfounded Data

Arxiv

0+阅读 · 2021年3月7日

Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms

Arxiv

0+阅读 · 2021年3月5日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Learning to Walk via Deep Reinforcement Learning

Arxiv

7+阅读 · 2018年12月26日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月6日

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Arxiv

3+阅读 · 2018年7月16日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

相关VIP内容

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【O'Reilly AI Conference 2019】AI成长之路：使用指南（For AI to thrive, failure is necessary: A practical guide (sponsored by IBM Watson)),IBM Ritika Gunnar

【O'Reilly AI Conference 2019】AI成长之路：使用指南（For AI to thrive, failure is necessary: A practical guide (sponsored by IBM Watson)),IBM Ritika Gunnar

专知会员服务

11+阅读 · 2019年11月5日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

面向向量的机器学习系统：跨栈方法

【ICCV2025】SO(3) 上连续非保守动力系统的预测

【ICCV2025】Lay2Story：扩展扩散 Transformer 以实现可切换布局的故事生成

人工智能治理全景综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Arxiv

0+阅读 · 2021年3月8日

Rapidly-exploring Random Forest: Adaptively Exploits Local Structure with Generalised Multi-Trees Motion Planning

Arxiv

0+阅读 · 2021年3月7日

Causal Inference With Selectively Deconfounded Data

Arxiv

0+阅读 · 2021年3月7日

Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms

Arxiv

0+阅读 · 2021年3月5日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Learning to Walk via Deep Reinforcement Learning

Arxiv

7+阅读 · 2018年12月26日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月6日

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Arxiv

3+阅读 · 2018年7月16日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

微信扫码咨询专知VIP会员