与深RELU网络进行离线强化学习的复杂性 (Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks) - 专知论文

会员服务 ·

0

样本复杂度 · Networking · ReLU · 泛函 · 近似 ·

2022 年 12 月 13 日

Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks

翻译：与深RELU网络进行离线强化学习的复杂性

Thanh Nguyen-Tang,Sunil Gupta,Hung Tran-The,Svetha Venkatesh

from arxiv, https://openreview.net/forum?id=LdEm0umNcv

Offline reinforcement learning (RL) leverages previously collected data for policy optimization without any further active exploration. Despite the recent interest in this problem, its theoretical results in neural network function approximation settings remain elusive. In this paper, we study the statistical theory of offline RL with deep ReLU network function approximation. In particular, we establish the sample complexity of $n = \tilde{\mathcal{O}}( H^{4 + 4 \frac{d}{\alpha}} \kappa_{\mu}^{1 + \frac{d}{\alpha}} \epsilon^{-2 - 2\frac{d}{\alpha}} )$ for offline RL with deep ReLU networks, where $\kappa_{\mu}$ is a measure of distributional shift, {$H = (1-\gamma)^{-1}$ is the effective horizon length}, $d$ is the dimension of the state-action space, $\alpha$ is a (possibly fractional) smoothness parameter of the underlying Markov decision process (MDP), and $\epsilon$ is a user-specified error. Notably, our sample complexity holds under two novel considerations: the Besov dynamic closure and the correlated structure. While the Besov dynamic closure subsumes the dynamic conditions for offline RL in the prior works, the correlated structure renders the prior works of offline RL with general/neural network function approximation improper or inefficient {in long (effective) horizon problems}. To the best of our knowledge, this is the first theoretical characterization of the sample complexity of offline RL with deep neural network function approximation under the general Besov regularity condition that goes beyond {the linearity regime} in the traditional Reproducing Hilbert kernel spaces and Neural Tangent Kernels.

翻译：离线强化学习 (RL) 利用先前收集的优化政策的数据,而没有进一步积极探索。尽管最近对这一问题感兴趣, 但它在神经网络运行的理论结果仍然难以找到。在本文中, 我们研究离线 RLL 的统计理论, 与深 ReLU 网络运行。特别是, 我们建立 $ =\ tile\ mathcal{O} (( H ⁇ 4 + 4\\\ frac{ dalpha}\\ kapapala ⁇ _ mu ⁇ 1 +\ frac} (d- halphalpha} =\ eepsilon}-2 - 2\ frac{ d- halpha} 设置的神经网络的理论结果。我们的离线 RLL 运行的常规运行规则系统运行最平稳的参数是 IMOGL IMFIL 的常规运行法。

0

相关内容

样本复杂度

样本复杂度

深度强化学习方法及其在经济学中的应用综述，Comprehensive Review of Deep Reinforcement Learning Methods and Applicationsin Economic

深度强化学习方法及其在经济学中的应用综述，Comprehensive Review of Deep Reinforcement Learning Methods and Applicationsin Economic

专知会员服务

52+阅读 · 2020年4月7日

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

专知会员服务

25+阅读 · 2020年2月28日

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

专知会员服务

29+阅读 · 2019年11月3日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

几类含∞-Laplace算子的特征值问题的研究

国家自然科学基金

1+阅读 · 2015年12月31日

HOXB7基因促胃癌转移机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

食管癌高发区人群癌前病变的外周血中miRNAs筛查标志物研究

国家自然科学基金

0+阅读 · 2014年12月31日

中国地区气象要素的多时空信息特征研究

国家自然科学基金

0+阅读 · 2013年12月31日

三维椭圆方程Cauchy问题的正则化方法

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

"β-hCG-ERK1/2-MMP-2"信号通路在卵巢癌侵袭、转移中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

纳米间隙耦合增强表面等离子体共振扫描显微镜的研制及应用

国家自然科学基金

0+阅读 · 2012年12月31日

宇宙暗物质和弱引力透镜功率谱的信息量研究

国家自然科学基金

0+阅读 · 2011年12月31日

Deep Offline Reinforcement Learning for Real-World Treatment Optimization Applications

Arxiv

0+阅读 · 2023年2月15日

Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

Arxiv

0+阅读 · 2023年2月15日

Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs

Arxiv

1+阅读 · 2023年2月14日

Conservative State Value Estimation for Offline Reinforcement Learning

Arxiv

0+阅读 · 2023年2月14日

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation

Arxiv

0+阅读 · 2023年2月13日

Reinforcement Learning with Almost Sure Constraints

Reinforcement Learning with Almost Sure Constraints

Arxiv

0+阅读 · 2023年2月13日

Learning to Act Safely with Limited Exposure and Almost Sure Certainty

Arxiv

0+阅读 · 2023年2月13日

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年2月10日

The Principles of Deep Learning Theory

Arxiv

65+阅读 · 2021年6月18日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

VIP会员

文章信息

相关主题

样本复杂度

相关VIP内容

深度强化学习方法及其在经济学中的应用综述，Comprehensive Review of Deep Reinforcement Learning Methods and Applicationsin Economic

深度强化学习方法及其在经济学中的应用综述，Comprehensive Review of Deep Reinforcement Learning Methods and Applicationsin Economic

专知会员服务

52+阅读 · 2020年4月7日

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

专知会员服务

25+阅读 · 2020年2月28日

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

专知会员服务

29+阅读 · 2019年11月3日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Deep Offline Reinforcement Learning for Real-World Treatment Optimization Applications

Arxiv

0+阅读 · 2023年2月15日

Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

Arxiv

0+阅读 · 2023年2月15日

Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs

Arxiv

1+阅读 · 2023年2月14日

Conservative State Value Estimation for Offline Reinforcement Learning

Arxiv

0+阅读 · 2023年2月14日

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation

Arxiv

0+阅读 · 2023年2月13日

Reinforcement Learning with Almost Sure Constraints

Reinforcement Learning with Almost Sure Constraints

Arxiv

0+阅读 · 2023年2月13日

Learning to Act Safely with Limited Exposure and Almost Sure Certainty

Arxiv

0+阅读 · 2023年2月13日

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年2月10日

The Principles of Deep Learning Theory

Arxiv

65+阅读 · 2021年6月18日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

几类含∞-Laplace算子的特征值问题的研究

国家自然科学基金

1+阅读 · 2015年12月31日

HOXB7基因促胃癌转移机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

食管癌高发区人群癌前病变的外周血中miRNAs筛查标志物研究

国家自然科学基金

0+阅读 · 2014年12月31日

中国地区气象要素的多时空信息特征研究

国家自然科学基金

0+阅读 · 2013年12月31日

三维椭圆方程Cauchy问题的正则化方法

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

"β-hCG-ERK1/2-MMP-2"信号通路在卵巢癌侵袭、转移中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

纳米间隙耦合增强表面等离子体共振扫描显微镜的研制及应用

国家自然科学基金

0+阅读 · 2012年12月31日

宇宙暗物质和弱引力透镜功率谱的信息量研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员