Learning Discrete State Abstractions With Deep Variational Inference - 专知 (Zhuanzhi) Papers

Discretization · Learning · Vector space · Continuity · Inference

January 11, 2021

Learning Discrete State Abstractions With Deep Variational Inference


Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

From arXiv, 15 pages, 7 figures

Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose an information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural encoder to map states onto continuous embeddings. We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through this learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments. Source code is available at https://github.com/ondrejba/discrete_abstractions.
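To illustrate the planning step mentioned in the abstract, the following is a minimal sketch of value iteration over a small discrete abstract model. It is not the authors' implementation: the function name `plan_in_abstract_mdp`, the hand-written transition table `T`, the reward of 1 for reaching the goal, and the discount factor are all assumptions made for illustration. In the paper's setting the action-conditioned transition probabilities would come from the learned hidden Markov model rather than being written by hand.

```python
def plan_in_abstract_mdp(T, goal, gamma=0.9, iters=100):
    """Value iteration on a discrete abstract model (illustrative sketch).

    T[a][z][z2] is the assumed probability of moving from abstract state z
    to abstract state z2 under action a. Reaching `goal` yields reward 1,
    and the goal is treated as terminal.
    Returns (greedy action per abstract state, state values).
    """
    num_actions = len(T)
    num_states = len(T[0])
    reward = [1.0 if z == goal else 0.0 for z in range(num_states)]
    V = [0.0] * num_states
    Q = [[0.0] * num_states for _ in range(num_actions)]
    for _ in range(iters):
        # One-step lookahead target for each possible next abstract state.
        target = [reward[z2] + gamma * V[z2] for z2 in range(num_states)]
        Q = [[sum(T[a][z][z2] * target[z2] for z2 in range(num_states))
              for z in range(num_states)]
             for a in range(num_actions)]
        V = [max(Q[a][z] for a in range(num_actions))
             for z in range(num_states)]
        V[goal] = 0.0  # terminal: no further return after the goal
    policy = [max(range(num_actions), key=lambda a: Q[a][z])
              for z in range(num_states)]
    return policy, V


# Hypothetical 3-state chain: action 0 stays in place, action 1 moves right.
T = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # action 1: right
]
policy, values = plan_in_abstract_mdp(T, goal=2)
print(policy[:2], values[:2])
```

Because the goal enters only through the reward vector, planning for a different goal means rerunning the same routine with another `goal` index while the transition model stays fixed. This is one way to read the multi-goal setting described in the abstract.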
