Neural ordinary differential equations (Neural ODEs) model continuous-time dynamics as differential equations parametrized by neural networks. Thanks to their modeling flexibility, they have been adopted for multiple tasks where the continuous-time nature of the process is especially relevant, such as system identification and time-series analysis. When applied in a control setting, they can be adapted to approximate optimal nonlinear feedback policies. This formulation follows the same approach as policy gradients in reinforcement learning, covering the case where the environment consists of known deterministic dynamics given by a system of differential equations. The white-box nature of the model specification allows the direct calculation of policy gradients through sensitivity analysis, avoiding the inexact and inefficient gradient estimation through sampling. In this work we propose the use of a neural control policy posed as a Neural ODE to solve general nonlinear optimal control problems while satisfying both state and control constraints, which are crucial for real-world scenarios. Since the state feedback policy partially modifies the model dynamics, the whole phase space of the system is reshaped during optimization. This approach is a sensible approximation to the historically intractable closed-loop solution of nonlinear control problems that efficiently exploits the availability of a dynamical system model.
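As a minimal sketch of the formulation described above (the symbols $x$, $u$, $f$, $L$, $\pi_\theta$, and the constraint sets $\mathcal{X}$, $\mathcal{U}$ are illustrative choices, not notation taken from the paper), the closed-loop problem can be written as
\[
\min_{\theta} \int_{0}^{T} L\bigl(x(t), u(t)\bigr)\,dt
\quad \text{s.t.} \quad
\dot{x}(t) = f\bigl(x(t), u(t)\bigr), \quad
u(t) = \pi_{\theta}\bigl(x(t)\bigr), \quad
x(t) \in \mathcal{X}, \; u(t) \in \mathcal{U},
\]
so that substituting the neural feedback policy into the known dynamics yields a Neural ODE $\dot{x} = f(x, \pi_{\theta}(x))$ whose sensitivities with respect to $\theta$ give the policy gradient directly.

The following is a hypothetical sketch (not the authors' code) of this idea in JAX: a small neural policy is placed inside known deterministic dynamics, the closed-loop ODE is integrated with a fixed-step Euler solver, and the policy gradient is obtained by differentiating the rollout cost. The network architecture, the Van der Pol dynamics, and the quadratic cost are all assumptions made for illustration, and constraint handling is omitted.

```python
import jax
import jax.numpy as jnp

def policy(theta, x):
    # Illustrative tanh network mapping state -> bounded control in [-1, 1].
    h = jnp.tanh(theta["W1"] @ x + theta["b1"])
    return jnp.tanh(theta["W2"] @ h + theta["b2"])

def dynamics(x, u):
    # Known deterministic model (example): a controlled Van der Pol oscillator.
    x1, x2 = x
    return jnp.array([x2, (1.0 - x1**2) * x2 - x1 + u[0]])

def rollout_cost(theta, x0, dt=0.01, steps=500):
    # Euler integration of the closed-loop Neural ODE x' = f(x, pi_theta(x)),
    # accumulating a quadratic running cost; gradients flow through the solver,
    # playing the role of the sensitivity analysis mentioned in the abstract.
    def step(carry, _):
        x, cost = carry
        u = policy(theta, x)
        x_next = x + dt * dynamics(x, u)
        cost = cost + dt * (x @ x + 0.1 * (u @ u))
        return (x_next, cost), None
    (_, cost), _ = jax.lax.scan(step, (x0, jnp.zeros(())), None, length=steps)
    return cost

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
theta = {
    "W1": 0.1 * jax.random.normal(k1, (16, 2)), "b1": jnp.zeros(16),
    "W2": 0.1 * jax.random.normal(k2, (1, 16)), "b2": jnp.zeros(1),
}
grads = jax.grad(rollout_cost)(theta, jnp.array([1.0, 0.0]))  # policy gradient
```

In the approach the abstract describes, such gradients are obtained through ODE sensitivity analysis rather than a hand-rolled Euler loop, but the closed-loop structure being differentiated is the same.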