A Theory of State Abstraction for Reinforcement Learning

January 12, 2019, CreateAMind

David Abel, Department of Computer Science, Brown University, david_abel@brown.edu


Abstract 


Reinforcement learning presents a challenging problem: agents must generalize experiences, efficiently explore the world, and learn from feedback that is delayed and often sparse, all while making use of a limited computational budget. Abstraction is essential to all of these endeavors. Through abstraction, agents can form concise models of both their surroundings and behavior, supporting effective decision making in diverse and complex environments. To this end, the goal of my doctoral research is to characterize the role abstraction plays in reinforcement learning, with a focus on state abstraction. I offer three desiderata articulating what it means for a state abstraction to be useful, and introduce classes of state abstractions that provide a partial path toward satisfying these desiderata. Collectively, I develop theory for state abstractions that can 1) preserve near-optimal behavior, 2) be learned and computed efficiently, and 3) lower the time or data needed to make effective decisions. I close by discussing extensions of these results to an information-theoretic paradigm of abstraction and to hierarchical abstraction that enjoys the same desirable properties.
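To make the notion of a state abstraction concrete, the following is a minimal illustrative sketch and not the method developed in this work. It assumes a small tabular MDP with known dynamics, and it groups ground states whose optimal action values are within a tolerance of a cluster representative, which is one simple way to aggregate states while keeping behavior close to optimal. The function names `value_iteration` and `q_star_abstraction`, the four-state MDP, and the parameters `gamma` and `epsilon` are all hypothetical choices made for this example.

```python
import numpy as np

def value_iteration(T, R, gamma=0.9, iters=500):
    """Approximate Q* for a tabular MDP.

    T[s, a, s2] holds transition probabilities and R[s, a] holds rewards.
    """
    n_states, n_actions, _ = T.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)          # greedy state values
        Q = R + gamma * (T @ V)    # Bellman optimality backup
    return Q

def q_star_abstraction(Q, epsilon):
    """Map each ground state to an abstract state index.

    A state joins the first cluster whose representative has Q-values
    within epsilon for every action; otherwise it starts a new cluster.
    This greedy grouping is an assumption of the sketch.
    """
    phi = np.full(len(Q), -1, dtype=int)
    representatives = []
    for s in range(len(Q)):
        for k, rep in enumerate(representatives):
            if np.max(np.abs(Q[s] - Q[rep])) <= epsilon:
                phi[s] = k
                break
        else:
            phi[s] = len(representatives)
            representatives.append(s)
    return phi

# Hypothetical 4-state, 2-action MDP: states 1 and 2 behave identically,
# and state 3 is an absorbing goal state.
n_states, n_actions = 4, 2
T = np.zeros((n_states, n_actions, n_states))
R = np.zeros((n_states, n_actions))
T[0, 0, 1] = 1.0                   # state 0, action 0 -> state 1
T[0, 1, 2] = 1.0                   # state 0, action 1 -> state 2
for s in (1, 2):
    T[s, 0, 3] = 1.0               # action 0 reaches the goal
    R[s, 0] = 1.0                  # and earns reward 1
    T[s, 1, 0] = 1.0               # action 1 returns to state 0
T[3, :, 3] = 1.0                   # goal is absorbing, zero reward

Q_star = value_iteration(T, R)
phi = q_star_abstraction(Q_star, epsilon=0.01)
print(phi.tolist())                # [0, 1, 1, 2]
```

Here states 1 and 2 collapse into a single abstract state, so the abstract problem has three states instead of four. Larger values of `epsilon` merge more states at the cost of a looser guarantee on the quality of the resulting behavior.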


1 Introduction 

The focus of my doctoral research is on clarifying the representational practices that underlie effective Reinforcement Learning (RL), drawing on Information Theory, Computational Complexity, and Computational Learning Theory. The guiding question of my research is: “How do intelligent agents come up with the right abstract understanding of the worlds they inhabit?”, as pictured in Figure 1. I study this question by isolating and addressing its simplest unanswered forms through a mixture of theoretical analysis and experimentation. 

My interest in this question stems from its foundational role in many aspects of learning and decision making: agents can’t model everything in their environment, but must necessarily pick up on something about their surroundings in order to explore, plan far into the future, generalize, solve credit assignment, communicate, and efficiently solve problems. Abstraction is essential to all of these endeavors: through abstraction, agents can construct models of both their surroundings and behavior that are compressed and useful.



