When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning - 专知 Papers


Functional · Q-function · Generalization theory · Learning · Better

February 8, 2023

When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning


Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang

From arXiv, ICLR 2023

In offline reinforcement learning (RL), one issue that is detrimental to policy learning is the error accumulation of deep Q functions in out-of-distribution (OOD) areas. Unfortunately, existing offline RL methods are often over-conservative, inevitably hurting generalization performance outside the data distribution. In our study, one interesting observation is that deep Q functions approximate well inside the convex hull of the training data. Inspired by this, we propose a new method, DOGE (Distance-sensitive Offline RL with better GEneralization). DOGE marries dataset geometry with deep function approximators in offline RL, and enables exploitation in generalizable OOD areas rather than strictly constraining the policy within the data distribution. Specifically, DOGE trains a state-conditioned distance function that can be readily plugged into standard actor-critic methods as a policy constraint. Simple yet elegant, our algorithm enjoys better generalization compared to state-of-the-art methods on D4RL benchmarks. Theoretical analysis demonstrates the superiority of our approach over existing methods that are solely based on data distribution or support constraints.
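
The idea of using a distance function as a policy constraint can be illustrated with a toy sketch. The abstract does not give the form of the constraint, so everything below is an assumption made for illustration: `distance_to_data` is a hypothetical nearest-neighbor stand-in for the learned state-conditioned distance function, and `constrained_action`, `threshold`, and `penalty` are hypothetical names for a simple penalized action choice. It is not the DOGE implementation.

```python
import math

def distance_to_data(state, action, dataset):
    """Hypothetical stand-in for a learned distance function: Euclidean
    distance from (state, action) to the nearest pair in the dataset."""
    query = list(state) + list(action)
    return min(math.dist(query, list(s) + list(a)) for s, a in dataset)

def constrained_action(state, candidates, q_fn, dataset, threshold, penalty):
    """Pick the candidate with the best Q value, penalizing only the part
    of its distance to the data that exceeds `threshold` (an assumption)."""
    def score(action):
        excess = max(0.0, distance_to_data(state, action, dataset) - threshold)
        return q_fn(state, action) - penalty * excess
    return max(candidates, key=score)

# Toy offline dataset: one state, with actions 0.0 and 1.0 observed.
dataset = [((0.0,), (0.0,)), ((0.0,), (1.0,))]

# Toy critic that extrapolates optimistically to large actions.
q_fn = lambda state, action: action[0]

state = (0.0,)
candidates = [(0.5,), (1.2,), (5.0,)]

# Without a constraint, the far out-of-distribution action wins.
unconstrained = max(candidates, key=lambda a: q_fn(state, a))

# With the distance penalty, a mildly out-of-distribution action is
# still allowed, but the far one is rejected.
chosen = constrained_action(state, candidates, q_fn, dataset,
                            threshold=0.3, penalty=10.0)
```

In this toy setting the constrained choice is the action just outside the observed range rather than the most distant one, which mirrors the abstract's point that actions near the data can be exploited without restricting the policy strictly to the data distribution.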


