We consider infinite-horizon discounted Markov decision processes and study the convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log-linear policy class. Within the compatible function approximation framework, both methods with log-linear policies can be written as approximate versions of the policy mirror descent (PMD) method. We show that both methods attain linear convergence rates and $\mathcal{O}(1/\epsilon^2)$ sample complexities using a simple, non-adaptive geometrically increasing step size, without resorting to entropy or other strongly convex regularization. Lastly, as a byproduct, we obtain sublinear convergence rates for both methods with an arbitrary constant step size.
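As a minimal illustration of the PMD viewpoint referenced above (the symbols $\eta_k$, $\lambda$, and $\hat{Q}^{\pi_k}$ are introduced here for exposition and are not taken verbatim from the paper), the update of NPG/Q-NPG with a log-linear policy $\pi_k$ can be sketched, at each state $s$, as a multiplicative-weights step driven by an estimated action-value function:
\[
\pi_{k+1}(\cdot \mid s) \;\propto\; \pi_k(\cdot \mid s)\,\exp\!\big(\eta_k\,\hat{Q}^{\pi_k}(s,\cdot)\big),
\qquad
\eta_{k+1} = \lambda\,\eta_k, \quad \lambda > 1,
\]
where $\hat{Q}^{\pi_k}$ is the compatible (log-linear) approximation of the true action-value function and the geometric growth of $\eta_k$ corresponds to the non-adaptive geometrically increasing step-size rule mentioned above; the exact update and the treatment of the approximation error are as defined in the paper.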