强化学习最新教程，17页pdf - 专知VIP

会员服务 ·

45

强化学习 · 教程 ·

2019 年 10 月 11 日

强化学习最新教程，17页pdf

专知会员服务

专知，提供专业可信的知识分发服务，让认知协作更快更好！

The tutorial is written for those who would like an introduction to reinforcement learning (RL). The aim is to provide an intuitive presentation of the ideas rather than concentrate on the deeper mathematics underlying the topic. RL is generally used to solve the so-called Markov decision problem (MDP). In other words, the problem that you are attempting to solve with RL should be an MDP or its variant. The theory of RL relies on dynamic programming (DP) and artificial intelligence (AI). We will begin with a quick description of MDPs. We will discuss what we mean by “complex” and “large-scale” MDPs. Then we will explain why RL is needed to solve complex and large-scale MDPs. The semi-Markov decision problem (SMDP) will also be covered.

The tutorial is meant to serve as an introduction to these topics and is based mostly on the book: “Simulation-based optimization: Parametric Optimization techniques and reinforcement learning” [4]. The book discusses this topic in greater detail in the context of simulators. There are at least two other textbooks that I would recommend you to read: (i) Neuro-dynamic programming [2] (lots of details on convergence analysis) and (ii) Reinforcement Learning: An Introduction [11] (lots of details on underlying AI concepts). A more recent tutorial on this topic is [8]. This tutorial has 2 sections: • Section 2 discusses MDPs and SMDPs. • Section 3 discusses RL. By the end of this tutorial, you should be able to • Identify problem structures that can be set up as MDPs / SMDPs. • Use some RL algorithms.

相关内容

强化学习

强化学习（RL）是机器学习的一个领域，与软件代理应如何在环境中采取行动以最大化累积奖励的概念有关。除了监督学习和非监督学习外，强化学习是三种基本的机器学习范式之一。强化学习与监督学习的不同之处在于，不需要呈现带标签的输入/输出对，也不需要显式纠正次优动作。相反，重点是在探索（未知领域）和利用（当前知识）之间找到平衡。该环境通常以马尔可夫决策过程（MDP）的形式陈述，因为针对这种情况的许多强化学习算法都使用动态编程技术。经典动态规划方法和强化学习算法之间的主要区别在于，后者不假设MDP的确切数学模型，并且针对无法采用精确方法的大型MDP。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

《强化学习》简介小册，24页pdf

《强化学习》简介小册，24页pdf

专知会员服务

277+阅读 · 2020年4月19日

【教程】自然语言处理中的迁移学习原理，41 页PPT

【教程】自然语言处理中的迁移学习原理，41 页PPT

专知会员服务

96+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【DeepMind-Nando de Freitas】强化学习教程，102页ppt，Reinforcement Learning

【DeepMind-Nando de Freitas】强化学习教程，102页ppt，Reinforcement Learning

专知会员服务

84+阅读 · 2019年11月15日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

OpenAI官方发布：强化学习中的关键论文

OpenAI官方发布：强化学习中的关键论文

专知

14+阅读 · 2018年12月12日

【微软亚研130PPT教程】强化学习简介

【微软亚研130PPT教程】强化学习简介

专知

36+阅读 · 2018年10月26日

【伯克利大学ICML2018强化学习80页教程】【附下载】

【伯克利大学ICML2018强化学习80页教程】【附下载】

专知

10+阅读 · 2018年7月21日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo

gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo

Arxiv

7+阅读 · 2019年3月14日

Optimization Models for Machine Learning: A Survey

Arxiv

18+阅读 · 2019年1月16日

Self-Driving Cars: A Survey

Self-Driving Cars: A Survey

Arxiv

41+阅读 · 2019年1月14日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Meta-Learning: A Survey

Arxiv

136+阅读 · 2018年10月8日

ViZDoom Competitions: Playing Doom from Pixels

ViZDoom Competitions: Playing Doom from Pixels

Arxiv

5+阅读 · 2018年9月10日

A survey on policy search algorithms for learning robot controllers in a handful of trials

Arxiv

3+阅读 · 2018年7月6日

A Tour of Reinforcement Learning: The View from Continuous Control

Arxiv

6+阅读 · 2018年6月25日

Attention Is All You Need

Arxiv

27+阅读 · 2017年12月6日

VIP会员

相关主题

相关VIP内容

《强化学习》简介小册，24页pdf

《强化学习》简介小册，24页pdf

专知会员服务

277+阅读 · 2020年4月19日

【教程】自然语言处理中的迁移学习原理，41 页PPT

【教程】自然语言处理中的迁移学习原理，41 页PPT

专知会员服务

96+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【DeepMind-Nando de Freitas】强化学习教程，102页ppt，Reinforcement Learning

【DeepMind-Nando de Freitas】强化学习教程，102页ppt，Reinforcement Learning

专知会员服务

84+阅读 · 2019年11月15日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《军事远程操作中的自动语音识别与多模态交互技术》最新报告

《人工智能与预测性健康管理（PHM）技术在军事智能装备保障中的应用》

人工智能无人机：传统军事优势面临的新挑战

《定向能武器对无人机核心系统及体外细胞动力学影响的深度剖析》最新28页报告

相关资讯

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

OpenAI官方发布：强化学习中的关键论文

OpenAI官方发布：强化学习中的关键论文

专知

14+阅读 · 2018年12月12日

【微软亚研130PPT教程】强化学习简介

【微软亚研130PPT教程】强化学习简介

专知

36+阅读 · 2018年10月26日

【伯克利大学ICML2018强化学习80页教程】【附下载】

【伯克利大学ICML2018强化学习80页教程】【附下载】

专知

10+阅读 · 2018年7月21日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo

gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo

Arxiv

7+阅读 · 2019年3月14日

Optimization Models for Machine Learning: A Survey

Arxiv

18+阅读 · 2019年1月16日

Self-Driving Cars: A Survey

Self-Driving Cars: A Survey

Arxiv

41+阅读 · 2019年1月14日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Meta-Learning: A Survey

Arxiv

136+阅读 · 2018年10月8日

ViZDoom Competitions: Playing Doom from Pixels

ViZDoom Competitions: Playing Doom from Pixels

Arxiv

5+阅读 · 2018年9月10日

A survey on policy search algorithms for learning robot controllers in a handful of trials

Arxiv

3+阅读 · 2018年7月6日

A Tour of Reinforcement Learning: The View from Continuous Control

Arxiv

6+阅读 · 2018年6月25日

Attention Is All You Need

Arxiv

27+阅读 · 2017年12月6日

微信扫码咨询专知VIP会员