This work is concerned with efficiently training neural network-based feedback controllers for optimal control problems. We first conduct a comparative study of two mainstream approaches: offline supervised learning and online direct policy optimization. Although the training part of the supervised learning approach is relatively straightforward, the success of the method heavily depends on the optimal control dataset generated by open-loop optimal control solvers. In contrast, direct policy optimization turns the optimal control problem into an optimization problem directly, without requiring any pre-computed data, but the dynamics-related objective can be hard to optimize when the problem is complicated. Our results highlight the superiority of offline supervised learning in terms of both optimality and training time. To overcome the main challenges of the two approaches, namely the dataset and the optimization, we complement them and propose the Pre-train and Fine-tune strategy as a unified training paradigm for optimal feedback control, which further improves performance and robustness significantly. Our code is available at https://github.com/yzhao98/DeepOptimalControl.
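To make the two stages of the proposed paradigm concrete, below is a minimal sketch of pre-training a feedback controller by supervised regression on open-loop optimal data and then fine-tuning it by direct policy optimization through the rolled-out dynamics. All names (`policy`, `dynamics`, `running_cost`), the placeholder dynamics and cost, and the random stand-in dataset are assumptions for illustration only, not the implementation in the repository above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

state_dim, control_dim, horizon, dt = 4, 2, 50, 0.02

# Feedback controller u = pi(t, x), parameterized by a small neural network.
policy = nn.Sequential(
    nn.Linear(state_dim + 1, 64), nn.Tanh(),
    nn.Linear(64, control_dim),
)

def dynamics(x, u):
    # Placeholder linear dynamics x' = Ax + Bu; the real system is problem-specific.
    A = torch.eye(state_dim) * (-0.1)
    B = torch.zeros(state_dim, control_dim)
    B[:control_dim] = torch.eye(control_dim)
    return x @ A.T + u @ B.T

def running_cost(x, u):
    return (x ** 2).sum(-1) + 0.1 * (u ** 2).sum(-1)

# ---- Stage 1: offline supervised pre-training ----
# (t_i, x_i, u_i*) pairs would come from an open-loop optimal control solver;
# random stand-ins are used here only to keep the sketch runnable.
t_data = torch.rand(1024, 1) * horizon * dt
x_data = torch.randn(1024, state_dim)
u_star = torch.randn(1024, control_dim)

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    pred = policy(torch.cat([t_data, x_data], dim=-1))
    loss = ((pred - u_star) ** 2).mean()   # regression onto optimal controls
    opt.zero_grad(); loss.backward(); opt.step()

# ---- Stage 2: online fine-tuning by direct policy optimization ----
# Roll the closed-loop system forward and differentiate the accumulated cost
# with respect to the policy parameters.
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
for _ in range(100):
    x = torch.randn(128, state_dim)        # batch of initial states
    total_cost = 0.0
    for k in range(horizon):
        t = torch.full((x.shape[0], 1), k * dt)
        u = policy(torch.cat([t, x], dim=-1))
        total_cost = total_cost + running_cost(x, u).mean() * dt
        x = x + dt * dynamics(x, u)        # explicit Euler step
    opt.zero_grad(); total_cost.backward(); opt.step()
```

In this reading, Stage 1 supplies a good initialization from the optimal control dataset, which is what makes the otherwise hard-to-optimize dynamics-related objective in Stage 2 tractable.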