State Representation Learning for Goal-Conditioned Reinforcement Learning


May 4, 2022


Lorenzo Steccanella, Anders Jonsson

This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge, learning from offline and unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that our method can successfully learn representations in large and/or continuous domains.
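To make the core idea concrete, the following is a minimal sketch of one way such an embedding objective could be written. It is an illustration under stated assumptions, not the authors' implementation: the function names (`step_gap_pairs`, `embedding_loss`, `heuristic`), the Euclidean norm, and the `penalty` weight are all hypothetical choices. The sketch relies on one fact from the abstract: the embedded distance should match the minimum number of actions between two states. In offline trajectory data, the step gap `j - i` between two visited states is only an upper bound on that minimum, so the sketch fits distances to the gap and penalises overshooting it more heavily.

```python
import numpy as np

def step_gap_pairs(traj_len):
    """Index pairs (i, j), i < j, of one trajectory and their step gaps j - i.

    The gap is only an upper bound on the minimum number of actions,
    because the trajectory need not follow a shortest path.
    """
    i, j = np.triu_indices(traj_len, k=1)
    return i, j, (j - i).astype(float)

def embedding_loss(emb, i, j, gaps, penalty=10.0):
    """Squared error to the step gap, plus an extra penalty for overshooting it.

    `penalty` is an assumed hyperparameter for this sketch.
    """
    dist = np.linalg.norm(emb[i] - emb[j], axis=1)
    fit = (dist - gaps) ** 2
    overshoot = np.maximum(dist - gaps, 0.0) ** 2
    return float(np.mean(fit + penalty * overshoot))

def heuristic(emb_state, emb_goal):
    """Embedded distance, usable as a goal-conditioned heuristic or shaped reward."""
    return float(np.linalg.norm(emb_state - emb_goal))

# Toy check: a 5-state chain visited left to right.
i, j, gaps = step_gap_pairs(5)
ideal = np.arange(5, dtype=float).reshape(-1, 1)  # state k embedded at coordinate k
collapsed = np.zeros((5, 1))                      # every state mapped to the origin
loss_ideal = embedding_loss(ideal, i, j, gaps)
loss_collapsed = embedding_loss(collapsed, i, j, gaps)
```

In this toy chain the embedding that places state k at coordinate k reproduces every step gap exactly and so incurs zero loss, while the collapsed embedding does not. In practice `emb` would be the output of a learned encoder, and `heuristic` could guide a planner or serve as a negative reward toward a goal.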

