2022 年 5 月 3 日

Finite Horizon Q-learning: Stability, Convergence, Simulations and an application on Smart Grids

Vivek VP, Dr. Shalabh Bhatnagar

Q-learning is a popular reinforcement learning algorithm. However, it has been studied and analysed mainly in the infinite horizon setting, even though several important applications can be modeled as finite horizon Markov decision processes (MDPs). We develop a version of the Q-learning algorithm for finite horizon MDPs and provide a full proof of its stability and convergence. Our analysis of the stability and convergence of finite horizon Q-learning is based entirely on the ordinary differential equation (O.D.E.) method. We also demonstrate the performance of our algorithm in a random MDP setting as well as on a smart grid application.
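
To make the idea of a stage-indexed Q-function concrete, here is a minimal tabular sketch in Python. It is not the authors' algorithm or code: it assumes a reward-maximizing formulation, a zero terminal value, access to a sampler for every (stage, state, action) triple, and a step size of 1/n. All function names (`make_random_mdp`, `backward_induction`, `finite_horizon_q_learning`) are hypothetical and chosen only for illustration. The sketch keeps one Q-table per stage and compares the learned values with exact backward induction on a small random MDP.

```python
import random


def make_random_mdp(n_states, n_actions, rng):
    """Random transition rows P[s][a] (each sums to 1) and rewards R[s][a] in [0, 1]."""
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            weights = [rng.random() + 1e-3 for _ in range(n_states)]
            total = sum(weights)
            P_s.append([w / total for w in weights])
            R_s.append(rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R


def backward_induction(P, R, horizon):
    """Exact stage-wise optimal Q-values via dynamic programming (uses the model)."""
    n_s, n_a = len(P), len(P[0])
    # Q[horizon] is the terminal stage; it stays at zero (assumed zero terminal value).
    Q = [[[0.0] * n_a for _ in range(n_s)] for _ in range(horizon + 1)]
    for h in range(horizon - 1, -1, -1):
        for s in range(n_s):
            for a in range(n_a):
                future = sum(P[s][a][s2] * max(Q[h + 1][s2]) for s2 in range(n_s))
                Q[h][s][a] = R[s][a] + future
    return Q[:horizon]


def finite_horizon_q_learning(P, R, horizon, sweeps, rng):
    """Sample-based update of one Q-table per stage; the model is used only to sample."""
    n_s, n_a = len(P), len(P[0])
    Q = [[[0.0] * n_a for _ in range(n_s)] for _ in range(horizon + 1)]
    for n in range(1, sweeps + 1):
        alpha = 1.0 / n  # assumed step size; it sums to infinity, its squares do not
        for h in range(horizon):
            for s in range(n_s):
                for a in range(n_a):
                    # Draw one next state from the transition row of (s, a).
                    s2 = rng.choices(range(n_s), weights=P[s][a])[0]
                    # Bootstrap from the NEXT stage's table, with no discounting.
                    target = R[s][a] + max(Q[h + 1][s2])
                    Q[h][s][a] += alpha * (target - Q[h][s][a])
    return Q[:horizon]


if __name__ == "__main__":
    rng = random.Random(0)
    P, R = make_random_mdp(3, 2, rng)
    Q_star = backward_induction(P, R, horizon=3)
    Q_hat = finite_horizon_q_learning(P, R, horizon=3, sweeps=5000, rng=rng)
    gap = max(
        abs(Q_hat[h][s][a] - Q_star[h][s][a])
        for h in range(3) for s in range(3) for a in range(2)
    )
    print("largest gap to backward induction:", gap)
```

The main difference from the infinite horizon update is that the target for stage h bootstraps from the table of stage h + 1 rather than from the same table, so the last stage simply learns the immediate reward and earlier stages build on it.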
