n-Step Temporal Difference Learning with Optimal n - 专知 Papers


Optimizer · TD · Learning · Continuity · Continuous optimization

March 13, 2023

n-Step Temporal Difference Learning with Optimal n


Lakshmi Mandal, Shalabh Bhatnagar

We consider the problem of finding the optimal value of n in the n-step temporal difference (TD) algorithm. We find the optimal n using simultaneous perturbation stochastic approximation (SPSA), a model-free optimization technique. We adapt a one-simulation SPSA procedure, originally designed for continuous optimization, to the discrete optimization setting, and incorporate a cyclic perturbation sequence. We prove the convergence of our proposed algorithm, SDPSA, and show that it finds the optimal value of n in n-step TD. Through experiments, we show that SDPSA reaches the optimal value of n from any arbitrary initial value of n.
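To make the abstract's two ingredients concrete, here is a minimal Python sketch. It is not the authors' SDPSA algorithm, whose details the abstract does not give. The first function runs standard n-step TD prediction on a small random walk and reports the RMS error of the learned values. The second is a simplified one-measurement, SPSA-style search over n with an alternating +1/-1 perturbation. The environment, the error-based cost, the step-size schedule, and the names `n_step_td_error` and `search_n` are all assumptions made for illustration.

```python
import random


def n_step_td_error(n, episodes=50, alpha=0.1, num_states=7, seed=0):
    """Run undiscounted n-step TD prediction on a random walk.

    States 0 and num_states - 1 are terminal; reaching the right end pays +1.
    Returns the RMS error of the learned values over the non-terminal states.
    """
    rng = random.Random(seed)
    left, right = 0, num_states - 1
    start = num_states // 2
    # Known true values for this walk: probability of ending on the right.
    true_v = [s / (num_states - 1) for s in range(num_states)]
    v = [0.5] * num_states
    v[left] = v[right] = 0.0

    for _ in range(episodes):
        states, rewards = [start], [0.0]
        T = float("inf")  # episode length, unknown until termination
        t = 0
        while True:
            if t < T:
                nxt = states[t] + rng.choice((-1, 1))
                states.append(nxt)
                rewards.append(1.0 if nxt == right else 0.0)
                if nxt in (left, right):
                    T = t + 1
            tau = t - n + 1  # time step whose value estimate is updated
            if tau >= 0:
                upper = min(tau + n, T)
                g = sum(rewards[i] for i in range(tau + 1, upper + 1))
                if tau + n < T:
                    g += v[states[tau + n]]  # bootstrap from the n-th next state
                v[states[tau]] += alpha * (g - v[states[tau]])
            if tau == T - 1:
                break
            t += 1

    inner = range(1, num_states - 1)
    return (sum((v[s] - true_v[s]) ** 2 for s in inner) / len(inner)) ** 0.5


def search_n(n_init, n_max=16, iterations=40, step=8.0):
    """Simplified one-measurement SPSA-style search over n (illustrative only)."""
    theta = float(n_init)  # continuous surrogate for the integer n
    for k in range(iterations):
        delta = 1.0 if k % 2 == 0 else -1.0  # cyclic perturbation: +1, -1, ...
        n_pert = min(n_max, max(1, round(theta + delta)))
        cost = n_step_td_error(n_pert, seed=k)  # single noisy cost measurement
        # delta is +/-1, so cost / delta equals cost * delta.
        theta -= (step / (k + 1)) * cost * delta
        theta = min(float(n_max), max(1.0, theta))  # project back onto [1, n_max]
    return min(n_max, max(1, round(theta)))


if __name__ == "__main__":
    print(search_n(n_init=4))
```

Each iteration uses a single cost measurement, which is what distinguishes one-simulation SPSA from the more common two-measurement form. Whether this simplified loop actually settles on a good n depends on the step sizes and the noise level; the convergence guarantee stated in the abstract applies to the authors' SDPSA, not to this sketch.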


