Keyframe Demonstration Seeded and Bayesian Optimized Policy Search - Zhuanzhi Papers

Policy Search · DBN · Learning · Optimizer · Robotics

January 19, 2023

Keyframe Demonstration Seeded and Bayesian Optimized Policy Search

Onur Berk Tore, Farzin Negahbani, Baris Akgun

This paper introduces a novel Learning from Demonstration framework that learns robotic skills from keyframe demonstrations using a Dynamic Bayesian Network (DBN), and improves the learned skills with a Bayesian Optimized Policy Search approach. The DBN learns the robot motion, the perceptual change in the object of interest (i.e., the skill sub-goals), and the relation between them. The rewards are also learned from the perceptual part of the DBN. The policy search part is a semi-black-box algorithm, which we call BO-PI2. It utilizes the action-perception relation to focus the high-level exploration, uses Gaussian Processes to model the expected return, and performs Upper Confidence Bound type low-level exploration for sampling the rollouts. BO-PI2 is compared against a state-of-the-art method on three different skills in a real robot setting with expert and naive user demonstrations. The results show that our approach successfully focuses the exploration on the failed sub-goals, and that the addition of reward-predictive exploration outperforms the state-of-the-art approach on cumulative reward, skill success, and termination time metrics.
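The abstract mentions modeling the expected return with Gaussian Processes and choosing rollouts with an Upper Confidence Bound rule. The sketch below illustrates that general idea only; it is a generic GP-UCB loop on a one-dimensional toy problem, not the authors' BO-PI2. The kernel hyperparameters, the `kappa` exploration weight, the candidate grid, and the `toy_return` function are all assumptions made up for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    # alpha = K^{-1} y, computed through the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def ucb_select(x_train, y_train, candidates, kappa=2.0):
    """Index of the candidate that maximizes mean + kappa * std."""
    mean, std = gp_posterior(x_train, y_train, candidates)
    return int(np.argmax(mean + kappa * std))


def toy_return(theta):
    """Hypothetical return, maximal at theta = 0.7 (illustration only)."""
    return -np.sum((theta - 0.7) ** 2, axis=-1)


def run_demo(n_rollouts=15, seed=0):
    """Run a short GP-UCB loop and report the best sampled parameter."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 101)[:, None]
    x = rng.uniform(0.0, 1.0, size=(2, 1))  # two random initial rollouts
    y = toy_return(x)
    for _ in range(n_rollouts):
        idx = ucb_select(x, y, candidates)
        theta = candidates[idx:idx + 1]
        x = np.vstack([x, theta])
        y = np.append(y, toy_return(theta))
    return float(x[np.argmax(y), 0]), float(np.max(y))


if __name__ == "__main__":
    best_theta, best_return = run_demo()
    print(best_theta, best_return)
```

In this sketch a larger `kappa` favors parameters the model is uncertain about, while a smaller one favors parameters with a high predicted return. On a real robot each call to `toy_return` would be replaced by an actual rollout and its measured reward.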
