We present a temporally extended variation of the successor representation, which we term t-SR. t-SR captures the expected state transition dynamics of temporally extended actions by constructing successor representations over primitive action repeats. This form of temporal abstraction does not learn a top-down hierarchy of pertinent task structures, but rather a bottom-up composition of coupled actions and action repetitions. This reduces the number of decisions required for control without learning a hierarchical policy. As such, t-SR directly considers the time horizon of temporally extended action sequences without the need for predefined or domain-specific options. We show that in environments with dynamic reward structure, t-SR is able to leverage both the flexibility of the successor representation and the abstraction afforded by temporally extended actions. Thus, in a series of sparsely rewarded gridworld environments, t-SR adapts its learnt policies to optimality far faster than comparable value-based, model-free reinforcement learning methods. We also show that, in solving these tasks, t-SR consistently samples its learnt policy less often than agents using non-temporally extended policies.
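To make the core idea above concrete, the following is a minimal tabular sketch of a successor representation (SR) learned over primitive-action repeats, i.e. over (action, repeat count) pairs. The environment interface (step(a) returning (next state, reward, done)), the repeat set, and all hyperparameter values are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

class TSRSketch:
    """Tabular sketch: SRs indexed by (primitive action, repeat count) pairs."""

    def __init__(self, n_states, n_actions, repeats=(1, 2, 4),
                 gamma=0.95, alpha=0.1, epsilon=0.1):
        self.repeats = repeats
        self.n_options = n_actions * len(repeats)          # (action, repeat) pairs
        self.M = np.zeros((n_states, self.n_options, n_states))  # one SR per option
        self.w = np.zeros(n_states)                        # learned reward weights
        self.gamma, self.alpha, self.eps = gamma, alpha, epsilon

    def act(self, s):
        # Epsilon-greedy over Q(s, o) = M(s, o) . w, as in SR-based control.
        if np.random.rand() < self.eps:
            return np.random.randint(self.n_options)
        return int(np.argmax(self.M[s] @ self.w))

    def step_option(self, env, s, option):
        # Execute primitive action `a` for `k` consecutive steps, accumulating
        # the discounted occupancy of the states visited along the way.
        a, idx = divmod(option, len(self.repeats))
        k = self.repeats[idx]
        occupancy = np.zeros(self.M.shape[2])
        discount, s_cur, done = 1.0, s, False
        for _ in range(k):
            occupancy[s_cur] += discount
            s_next, r, done = env.step(a)                  # assumed interface
            self.w[s_next] += self.alpha * (r - self.w[s_next])  # reward model
            discount *= self.gamma
            s_cur = s_next
            if done:
                break
        return s_cur, occupancy, discount, done

    def update(self, s, option, s_next, occupancy, discount, done):
        # SR TD update over the whole temporally extended transition,
        # bootstrapping with the option-level discount gamma^k.
        best = int(np.argmax(self.M[s_next] @ self.w))
        target = occupancy + (0.0 if done else discount) * self.M[s_next, best]
        self.M[s, option] += self.alpha * (target - self.M[s, option])
```

A training loop would simply alternate act, step_option, and update until the episode terminates; because each option spans k primitive steps, the policy is queried less often than a one-step policy acting over the same trajectory.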