Direct Policy Optimization using Deterministic Sampling and Collocation - 专知 Papers


Directed · Optimizer · Optimization · Samples · Quasi-Newton methods

January 11, 2023

Direct Policy Optimization using Deterministic Sampling and Collocation

Taylor A. Howell, Chunjiang Fu, Zachary Manchester

From arXiv, minor fixes

We present an approach for approximately solving discrete-time stochastic optimal-control problems by combining direct trajectory optimization, deterministic sampling, and policy optimization. Our feedback motion-planning algorithm uses a quasi-Newton method to simultaneously optimize a reference trajectory, a set of deterministically chosen sample trajectories, and a parameterized policy. We demonstrate that this approach exactly recovers LQR policies in the case of linear dynamics, quadratic objective, and Gaussian disturbances. We also demonstrate the algorithm on several nonlinear, underactuated robotic systems to highlight its performance and ability to handle control limits, safely avoid obstacles, and generate robust plans in the presence of unmodeled dynamics.
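The abstract names two ingredients that a short example can make concrete: the LQR policy the method is said to recover in the linear-quadratic-Gaussian case, and deterministically chosen samples. The Python sketch below is not the authors' implementation. It assumes the standard backward Riccati recursion for finite-horizon discrete-time LQR and a generic unscented-style sampling rule (mean plus and minus scaled Cholesky columns). The function names `lqr_gains` and `sigma_points`, and the default `scale`, are hypothetical choices made for illustration.

```python
import numpy as np


def lqr_gains(A, B, Q, R, Qf, T):
    """Finite-horizon discrete-time LQR via backward Riccati recursion.

    Returns a list of T gain matrices for the policy u_t = -K[t] @ x_t.
    This is the textbook baseline, not the paper's algorithm.
    """
    P = Qf.copy()
    gains = []
    for _ in range(T):
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A' P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # the recursion runs backward in time
    return gains


def sigma_points(mean, cov, scale=None):
    """Return 2n deterministic samples: mean +/- scale * columns of chol(cov).

    Assumption: with scale = sqrt(n), the equally weighted samples
    reproduce the given mean and covariance exactly.
    """
    n = mean.shape[0]
    if scale is None:
        scale = np.sqrt(n)
    L = np.linalg.cholesky(cov)  # lower triangular, L @ L.T == cov
    plus = [mean + scale * L[:, i] for i in range(n)]
    minus = [mean - scale * L[:, i] for i in range(n)]
    return np.array(plus + minus)
```

For a double integrator, `lqr_gains(A, B, Q, R, Qf, T)` gives one gain per time step, and `sigma_points(mean, cov)` gives `2n` samples whose empirical mean and covariance match the inputs. A method like the one described would propagate such samples through the dynamics under a shared parameterized policy. How the paper chooses and weights its samples is not stated in the abstract.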
