A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration - Zhuanzhi Papers


Learning · Agent · MoDELS · Knowledge · Performer

September 9, 2022

A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration


Xianqi Zhang, Xingtao Wang, Xu Liu, Xiaopeng Fan, Debin Zhao

From arXiv, 13 pages, 10 figures

We pose a new question: can agents learn to combine actions from previous tasks to complete new tasks, as humans do? Unlike imitation learning, this setting has no expert data, only data collected through environmental exploration. Compared with offline reinforcement learning, the data distribution shift is more severe: the action sequence that solves a new task may be a combination of trajectory segments from multiple training tasks, so neither the test task nor its solving strategy appears directly in the training data. This makes the problem more difficult. We propose a Memory-related Multi-task Method (M3) to address it. The method consists of three stages. First, task-agnostic exploration is carried out to collect data. Unlike previous methods, we organize the exploration data into a knowledge graph. Based on the exploration data, we design a model that extracts action effect features and saves them in memory, and we also train an action predictive model. Second, for a new task, the action effect features stored in memory are used to generate candidate actions through a feature decomposition-based approach. Finally, a multi-scale candidate action pool and the action predictive model are fused to generate a strategy that completes the task. Experimental results show that the proposed method significantly outperforms the baseline.
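The central idea, that a new task may be solved by stitching together trajectory segments collected during exploration, can be illustrated with a toy sketch. This is not the authors' M3 implementation: it replaces the learned action effect features and action predictive model with a plain graph search. The state names, action names, and the helpers `build_graph` and `plan` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def build_graph(trajectories):
    """Merge exploration trajectories into one directed graph.

    Each trajectory is a list of (state, action, next_state) transitions.
    The graph maps state -> {action: next_state}.
    """
    graph = defaultdict(dict)
    for trajectory in trajectories:
        for state, action, next_state in trajectory:
            graph[state][action] = next_state
    return graph


def plan(graph, start, goal):
    """Breadth-first search for a shortest action sequence from start to goal.

    Returns a list of actions, or None if the goal is unreachable.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None


# Two hypothetical exploration trajectories in a toy key-and-door world.
# Neither one reaches "door_open" from "start" on its own.
trajectory_a = [("start", "go_to_key", "at_key"),
                ("at_key", "pick_up", "has_key")]
trajectory_b = [("has_key", "go_to_door", "at_door"),
                ("at_door", "open", "door_open")]

graph = build_graph([trajectory_a, trajectory_b])
result = plan(graph, "start", "door_open")
print(result)
```

In this sketch the plan for the unseen task combines segments from both trajectories, which is the situation the abstract describes: the solving strategy is not present in any single piece of the collected data. A real system would operate on learned features rather than on exact symbolic states.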

