推动轨道优化和政策学习之间在精确机器人控制方面的共识 (Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control) - 专知论文

会员服务 ·

0

Learning · 控制器 · 优化器 · INFORMS · 查准率/准确率 ·

2023 年 2 月 16 日

Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control

翻译：推动轨道优化和政策学习之间在精确机器人控制方面的共识

Quentin Le Lidec,Wilson Jallet,Ivan Laptev,Cordelia Schmid,Justin Carpentier

Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly converge towards a locally optimal control trajectory which is only valid within the vicinity of the solution. Over the past decade, several approaches have aimed to adequately combine the two classes of methods in order to obtain the best of both worlds. Following on from this line of research, we propose several improvements on top of these approaches to learn global control policies quicker, notably by leveraging sensitivity information stemming from TO methods via Sobolev learning, and augmented Lagrangian techniques to enforce the consensus between TO and policy learning. We evaluate the benefits of these improvements on various classical tasks in robotics through comparison with existing approaches in the literature.

翻译：强化学习(RL)和轨迹优化(TO)具有巨大的互补优势。一方面,RL方法能够直接从数据中学习全球控制政策,但通常需要大量样本规模才能适当地趋向可行的政策。另一方面,方法能够利用模拟器提取的梯度信息,迅速走向地方最佳控制轨迹,该轨迹仅在解决方案附近有效。在过去10年中,若干方法旨在充分结合这两类方法,以便获得两个世界的最佳结果。从这一研究轨迹来看,我们建议对这些方法进行一些改进,以便更快地学习全球控制政策,特别是利用通过索博列夫学习的方法产生的敏感信息,并增强拉格朗加技术,以落实学习与政策之间的共识。我们通过比较文献中的现有方法,评估这些改进对机器人中各种传统任务的好处。

0

相关内容

Learning

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

专知会员服务

24+阅读 · 2022年3月19日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

HuR/lincRNA152复合物在脓毒症免疫抑制中的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

几何约束视角下异构群体队形光滑变换控制方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于海洋要素场的涡旋过程数据建模与可视化

国家自然科学基金

2+阅读 · 2012年12月31日

移动机器人基于三维激光测距的室内场景认知与物体识别

国家自然科学基金

0+阅读 · 2012年12月31日

HIC1调控CIITA转录机制研究及其在B细胞分化中的意义

国家自然科学基金

0+阅读 · 2012年12月31日

基于Tetrolet变换的偏振遥感图像融合算法研究

国家自然科学基金

0+阅读 · 2012年12月31日

用荧光磁性纳米探针筛选中草药中的小分子酶抑制剂

国家自然科学基金

0+阅读 · 2011年12月31日

Dirichlet空间的分析与几何

国家自然科学基金

0+阅读 · 2011年12月31日

超大规模集成电路多目标划分的算法研究

国家自然科学基金

2+阅读 · 2010年12月31日

分数布朗运动环境下金融保险中优化问题的研究

国家自然科学基金

0+阅读 · 2009年12月31日

A Visual Active Search Framework for Geospatial Exploration

Arxiv

0+阅读 · 2023年4月5日

A Policy-Guided Imitation Approach for Offline Reinforcement Learning

Arxiv

0+阅读 · 2023年4月5日

Risk-Aware Distributed Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年4月4日

Learning robotic milling strategies based on passive variable operational space interaction control

Arxiv

0+阅读 · 2023年4月3日

Action Pick-up in Dynamic Action Space Reinforcement Learning

Arxiv

0+阅读 · 2023年4月3日

Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics

Arxiv

0+阅读 · 2023年4月3日

User-Conditioned Neural Control Policies for Mobile Robotics

Arxiv

0+阅读 · 2023年4月2日

Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots

Arxiv

0+阅读 · 2023年4月1日

Factorization of Multi-Agent Sampling-Based Motion Planning

Arxiv

0+阅读 · 2023年4月1日

A Survey on Trajectory Data Management, Analytics, and Learning

A Survey on Trajectory Data Management, Analytics, and Learning

Arxiv

16+阅读 · 2020年3月25日

VIP会员

文章信息

相关主题

查准率/准确率

相关VIP内容

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

专知会员服务

24+阅读 · 2022年3月19日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

A Visual Active Search Framework for Geospatial Exploration

Arxiv

0+阅读 · 2023年4月5日

A Policy-Guided Imitation Approach for Offline Reinforcement Learning

Arxiv

0+阅读 · 2023年4月5日

Risk-Aware Distributed Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年4月4日

Learning robotic milling strategies based on passive variable operational space interaction control

Arxiv

0+阅读 · 2023年4月3日

Action Pick-up in Dynamic Action Space Reinforcement Learning

Arxiv

0+阅读 · 2023年4月3日

Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics

Arxiv

0+阅读 · 2023年4月3日

User-Conditioned Neural Control Policies for Mobile Robotics

Arxiv

0+阅读 · 2023年4月2日

Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots

Arxiv

0+阅读 · 2023年4月1日

Factorization of Multi-Agent Sampling-Based Motion Planning

Arxiv

0+阅读 · 2023年4月1日

A Survey on Trajectory Data Management, Analytics, and Learning

A Survey on Trajectory Data Management, Analytics, and Learning

Arxiv

16+阅读 · 2020年3月25日

相关基金

HuR/lincRNA152复合物在脓毒症免疫抑制中的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

几何约束视角下异构群体队形光滑变换控制方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于海洋要素场的涡旋过程数据建模与可视化

国家自然科学基金

2+阅读 · 2012年12月31日

移动机器人基于三维激光测距的室内场景认知与物体识别

国家自然科学基金

0+阅读 · 2012年12月31日

HIC1调控CIITA转录机制研究及其在B细胞分化中的意义

国家自然科学基金

0+阅读 · 2012年12月31日

基于Tetrolet变换的偏振遥感图像融合算法研究

国家自然科学基金

0+阅读 · 2012年12月31日

用荧光磁性纳米探针筛选中草药中的小分子酶抑制剂

国家自然科学基金

0+阅读 · 2011年12月31日

Dirichlet空间的分析与几何

国家自然科学基金

0+阅读 · 2011年12月31日

超大规模集成电路多目标划分的算法研究

国家自然科学基金

2+阅读 · 2010年12月31日

分数布朗运动环境下金融保险中优化问题的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员