This paper introduces the Hamilton-Jacobi-Bellman Proximal Policy Optimization (HJBPPO) algorithm for reinforcement learning. The Hamilton-Jacobi-Bellman (HJB) equation is used in control theory to evaluate the optimality of the value function. Our work combines the HJB equation with reinforcement learning in continuous state and action spaces to improve the training of the value network. We treat the value network as a Physics-Informed Neural Network (PINN) that solves the HJB equation by computing its derivatives with respect to its inputs exactly. The Proximal Policy Optimization (PPO)-Clipped algorithm is augmented with this implementation, as it uses a value network to compute the objective function for its policy network. The HJBPPO algorithm shows improved performance compared to PPO on the MuJoCo environments.
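As a rough illustration of the PINN treatment described above, the sketch below shows how an HJB-style residual could be penalized in the value-network loss using exact automatic differentiation. It assumes a generic continuous-time form rho * V(s) = r(s, a) + dV/ds . f(s, a) with hypothetical, differentiable dynamics `f`, reward `r`, and discount rate `rho`; the paper's exact HJB formulation and loss weighting may differ.

```python
import torch

def hjb_residual_loss(value_net, policy, dynamics, reward, states, rho=0.01):
    """Sketch of a PINN-style HJB penalty for the value network.

    Hypothetical placeholders: `dynamics(s, a)` and `reward(s, a)` are assumed
    to be known, differentiable callables; `rho` is an assumed discount rate.
    The point is only that dV/ds is obtained exactly via autograd.
    """
    states = states.requires_grad_(True)
    values = value_net(states)                                  # V(s), shape [B, 1]
    grad_v = torch.autograd.grad(values.sum(), states,
                                 create_graph=True)[0]          # exact dV/ds
    actions = policy(states)                                    # candidate maximizing action
    drift = (grad_v * dynamics(states, actions)).sum(dim=-1)    # <dV/ds, f(s, a)>
    residual = rho * values.squeeze(-1) - reward(states, actions) - drift
    return (residual ** 2).mean()                               # added to the critic loss
```

In this sketch the residual would be minimized alongside the usual PPO value loss, so the critic is encouraged to satisfy the HJB optimality condition in addition to fitting returns.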