低地空间基于模型规划的理论空间时时预测编码 (Temporal Predictive Coding For Model-Based Planning In Latent Space) - 专知论文

会员服务 ·

0

INFORMS · MoDELS · 学成 · 回合 · 潜在 ·

2021 年 6 月 14 日

Temporal Predictive Coding For Model-Based Planning In Latent Space

翻译：低地空间基于模型规划的理论空间时时预测编码

Tung Nguyen,Rui Shu,Tuan Pham,Hung Bui,Stefano Ermon

from arxiv, International Conference on Machine Learning

High-dimensional observations are a major challenge in the application of model-based reinforcement learning (MBRL) to real-world environments. To handle high-dimensional sensory inputs, existing approaches use representation learning to map high-dimensional observations into a lower-dimensional latent space that is more amenable to dynamics estimation and planning. In this work, we present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time. Since this approach focuses on encoding temporally-predictable information, we implicitly prioritize the encoding of task-relevant components over nuisance information within the environment that are provably task-irrelevant. By learning this representation in conjunction with a recurrent state space model, we can then perform planning in latent space. We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task. Our experiments show that our model is superior to existing methods in the challenging complex-background setting while remaining competitive with current state-of-the-art models in the standard setting.

翻译：高维观测是将基于模型的强化学习(MBRL)应用到现实世界环境中的一大挑战。要处理高维感官输入, 现有方法使用代表式学习将高维观测绘制成一个更适于动态估计和规划的低维潜层空间。在这项工作中, 我们提出一种信息理论方法, 使用时间预测编码来编码环境中可以随时预测的元素。由于这种方法侧重于对时间- 可预见信息进行编码, 我们隐含地将任务相关部分的编码优先于环境中与任务密切相关的骚扰信息。通过与经常性状态空间模型一起学习这种表述, 我们就可以在潜在空间进行规划。我们评估了我们关于对标准DMM控制任务进行具有挑战性的修改的模式, 其背景被包含复杂但与规划任务无关信息的自然视频所取代。我们的实验表明, 我们的模型优于挑战性复杂背景环境中的现有方法, 同时在标准环境中与当前的最新模型保持竞争力。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

机器学习组合优化

机器学习组合优化

专知会员服务

110+阅读 · 2021年2月16日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

44+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

Yann Lecun 纽约大学《深度学习(PyTorch)》课程(2020）PPT

Yann Lecun 纽约大学《深度学习(PyTorch)》课程(2020）PPT

专知会员服务

183+阅读 · 2020年3月16日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Efficient Local Planning with Linear Function Approximation

Arxiv

0+阅读 · 2021年8月12日

Custom Distribution for Sampling-Based Motion Planning

Arxiv

0+阅读 · 2021年8月11日

Imitation by Predicting Observations

Imitation by Predicting Observations

Arxiv

4+阅读 · 2021年7月8日

Learning and Planning in Complex Action Spaces

Arxiv

4+阅读 · 2021年4月13日

Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Arxiv

9+阅读 · 2020年11月28日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Learning Sequence Encoders for Temporal Knowledge Graph Completion

Arxiv

6+阅读 · 2018年9月10日

A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

Arxiv

6+阅读 · 2018年3月29日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

VIP会员

文章信息

相关主题

相关VIP内容

机器学习组合优化

机器学习组合优化

专知会员服务

110+阅读 · 2021年2月16日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

44+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

Yann Lecun 纽约大学《深度学习(PyTorch)》课程(2020）PPT

Yann Lecun 纽约大学《深度学习(PyTorch)》课程(2020）PPT

专知会员服务

183+阅读 · 2020年3月16日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Efficient Local Planning with Linear Function Approximation

Arxiv

0+阅读 · 2021年8月12日

Custom Distribution for Sampling-Based Motion Planning

Arxiv

0+阅读 · 2021年8月11日

Imitation by Predicting Observations

Imitation by Predicting Observations

Arxiv

4+阅读 · 2021年7月8日

Learning and Planning in Complex Action Spaces

Arxiv

4+阅读 · 2021年4月13日

Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Arxiv

9+阅读 · 2020年11月28日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Learning Sequence Encoders for Temporal Knowledge Graph Completion

Arxiv

6+阅读 · 2018年9月10日

A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

Arxiv

6+阅读 · 2018年3月29日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

微信扫码咨询专知VIP会员