Simple Emergent Action Representations from Multi-Task Policy Training - 专知 Papers


October 18, 2022

Simple Emergent Action Representations from Multi-Task Policy Training


Pu Hua, Yubei Chen, Huazhe Xu

From arXiv; 17 pages, 8 figures

Low-level sensory and motor signals in high-dimensional spaces (e.g., image observations or motor torques) are difficult to understand or to harness directly for downstream tasks in deep reinforcement learning. While sensory representations have been widely studied, the representations of actions that form motor skills remain underexplored. In this work, we find that when a multi-task policy network takes states and task embeddings as input, a space based on the task embeddings emerges that, under moderate constraints, contains meaningful action representations. Within this space, interpolated or composed embeddings can serve as a high-level interface that instructs the agent to perform meaningful action sequences. Empirical results show that the proposed action representations are effective for intra-action interpolation and inter-action composition with limited or no learning, and that they adapt to new tasks better than strong baselines on MuJoCo locomotion tasks. The evidence indicates that learning action representations is a promising direction toward efficient, adaptable, and composable RL, forming a basis for abstract action planning and for understanding the motor signal space. Anonymous project page: https://sites.google.com/view/emergent-action-representation/
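The interface described in the abstract, a policy that takes a state and a task embedding and is steered by interpolating between embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation: the network sizes, the random weights standing in for a trained policy, the unit-sphere normalization used as the "moderate constraint", and the task names "walk" and "run" are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not specify any of them.
STATE_DIM, EMBED_DIM, HIDDEN_DIM, ACTION_DIM = 8, 4, 16, 3

# Randomly initialised weights stand in for a trained multi-task policy.
W1 = rng.normal(scale=0.1, size=(STATE_DIM + EMBED_DIM, HIDDEN_DIM))
W2 = rng.normal(scale=0.1, size=(HIDDEN_DIM, ACTION_DIM))


def normalize(z):
    """Project an embedding onto the unit sphere (an assumed constraint)."""
    return z / np.linalg.norm(z)


def policy(state, task_embedding):
    """Map a state and a task embedding to an action in [-1, 1]^ACTION_DIM."""
    x = np.concatenate([state, task_embedding])
    hidden = np.tanh(x @ W1)
    return np.tanh(hidden @ W2)


def interpolate(z_a, z_b, alpha):
    """Blend two task embeddings linearly, then re-project onto the sphere."""
    return normalize((1.0 - alpha) * z_a + alpha * z_b)


# Two hypothetical task embeddings, labelled "walk" and "run" for illustration.
z_walk = normalize(rng.normal(size=EMBED_DIM))
z_run = normalize(rng.normal(size=EMBED_DIM))

# Query the same state under the two original embeddings and their midpoint.
state = rng.normal(size=STATE_DIM)
alphas = (0.0, 0.5, 1.0)
actions = [policy(state, interpolate(z_walk, z_run, a)) for a in alphas]
```

In this sketch the policy weights never change; only the embedding passed in differs, which is what lets it act as a high-level control signal. Re-normalizing after blending keeps the interpolated embedding under the same assumed constraint as the originals.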

