Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks - 专知 Papers


Broad data · Generalization · Fine-tuning · Offline · Affordance model

April 18, 2023

Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks


Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine

From arXiv, CoRL 2022

The utilization of broad datasets has proven to be crucial for generalization for a wide range of fields. However, how to effectively make use of diverse multi-task data for novel downstream tasks still remains a grand challenge in robotics. To tackle this challenge, we introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data, in combination with online fine-tuning guided by subgoals in learned lossy representation space. When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems. Learned from the broad data, the lossy representation emphasizes task-relevant information about states and goals while abstracting away redundant contexts that hinder generalization. It thus enables subgoal planning for unseen tasks, provides a compact input to the policy, and facilitates reward shaping during fine-tuning. We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
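As a rough illustration of the subgoal-planning idea described in the abstract, the following Python sketch plans a short sequence of subgoals in a low-dimensional latent space and scores progress with a simple thresholded reward. It is not the authors' implementation: the `affordance` function is a toy stand-in for a learned affordance model, the random-shooting planner and the `shaped_reward` rule are swapped-in simplifications, and every name and parameter value here is an assumption made for illustration.

```python
import numpy as np

def affordance(z, u, step=0.6):
    """Toy stand-in (assumption) for a learned affordance model: maps the
    current latent z and a noise vector u to a nearby, assumed-reachable latent."""
    return z + step * np.tanh(u)

def plan_subgoals(z0, z_goal, num_subgoals=3, num_samples=2000, seed=0):
    """Random-shooting planner (assumption): sample noise sequences, roll them
    through the affordance model, and keep the sequence whose last subgoal
    lies closest to the goal latent."""
    rng = np.random.default_rng(seed)
    dim = z0.shape[0]
    u = rng.standard_normal((num_samples, num_subgoals, dim))
    z = np.broadcast_to(z0, (num_samples, dim))
    rollouts = []
    for k in range(num_subgoals):
        z = affordance(z, u[:, k])  # one subgoal step for every sample
        rollouts.append(z)
    rollouts = np.stack(rollouts, axis=1)  # (num_samples, num_subgoals, dim)
    cost = np.linalg.norm(rollouts[:, -1] - z_goal, axis=-1)
    return rollouts[np.argmin(cost)]

def shaped_reward(z, z_subgoal, eps=0.1):
    """Hypothetical reward: 0 when the latent is within eps of the subgoal, else -1."""
    return 0.0 if np.linalg.norm(z - z_subgoal) < eps else -1.0

if __name__ == "__main__":
    z_start = np.zeros(2)
    z_goal = np.array([1.5, 0.0])
    subgoals = plan_subgoals(z_start, z_goal)
    print(subgoals)
    print(shaped_reward(z_start, subgoals[0]))
```

In this sketch a goal-conditioned policy would be asked to reach each planned subgoal in turn, receiving `shaped_reward` relative to the current subgoal rather than the distant final goal, which is one way to read the abstract's claim that subgoals decompose the task into easier problems.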


