Recently, reinforcement learning has enabled dexterous manipulation skills of increasing complexity. Nonetheless, learning these skills in simulation still exhibits poor sample efficiency, which stems from the fact that these skills are learned from scratch without the benefit of any domain expertise. In this work, we aim to improve the sample efficiency of learning dexterous in-hand manipulation skills using sub-optimal controllers available via domain knowledge. Our framework optimally queries the sub-optimal controllers and guides exploration toward the regions of state space relevant to the task, thereby improving sample complexity. We show that our framework allows learning from highly sub-optimal controllers, and we are the first to demonstrate learning hard-to-explore finger-gaiting in-hand manipulation skills without the use of an exploratory reset distribution.