Simulation is essential to reinforcement learning (RL) before deployment in the real world, especially for safety-critical applications such as robot manipulation. RL agents are, however, sensitive to discrepancies between simulation and the real world, known as the sim-to-real gap. Domain randomization, a technique commonly used to bridge this gap, is limited to imposing heuristically randomized models. We investigate the intrinsic stochasticity of real-time simulation (RT-IS) in off-the-shelf simulation software and its potential to improve the robustness of RL methods and the performance of domain randomization. First, we conduct analytical studies to measure the correlation of RT-IS with computer hardware utilization and to validate its comparability with the natural stochasticity of a physical robot. We then apply RT-IS in the training of an RL agent. Simulation and physical experiment results verify the feasibility and applicability of RT-IS to robust RL agent design for robot manipulation tasks. The RT-IS-powered robust RL agent outperforms conventional RL agents on robots with modeling uncertainties; it requires less heuristic randomization and generalizes better than conventional domain-randomization-powered agents. Our findings offer a new perspective on the sim-to-real problem in practical applications such as robot manipulation.