With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining conversations with humans, and controlling robotic agents. However, a wide range of domains remains inaccessible to RL due to the high cost and danger of interacting with the environment. Offline RL is a paradigm that learns exclusively from static datasets of previously collected interactions, making it feasible to extract policies from large and diverse training datasets. Effective offline RL algorithms have a much wider range of applications than their online counterparts and are particularly appealing for real-world settings such as education, healthcare, and robotics. In this work, we contribute a unifying taxonomy to classify offline RL methods. Furthermore, we provide a comprehensive review of the latest algorithmic breakthroughs in the field using a unified notation, as well as a review of the properties and shortcomings of existing benchmarks. Additionally, we provide a figure that summarizes the performance of each method and class of methods on datasets with different properties, equipping researchers with the tools to decide which type of algorithm is best suited for the problem at hand and to identify which classes of algorithms look the most promising. Finally, we provide our perspective on open problems and propose future research directions for this rapidly growing field.