从示范活动中学习 (Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations) - 专知论文

会员服务 ·

0

FAST · 逆强化学习 · Learning · Performance · 机器人 ·

2022 年 11 月 19 日

Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

翻译：从示范活动中学习

Letian Chen,Sravan Jayanthi,Rohan Paleja,Daniel Martin,Viacheslav Zakharov,Matthew Gombolay

Learning from Demonstration (LfD) approaches empower end-users to teach robots novel tasks via demonstrations of the desired behaviors, democratizing access to robotics. However, current LfD frameworks are not capable of fast adaptation to heterogeneous human demonstrations nor the large-scale deployment in ubiquitous robotics applications. In this paper, we propose a novel LfD framework, Fast Lifelong Adaptive Inverse Reinforcement learning (FLAIR). Our approach (1) leverages learned strategies to construct policy mixtures for fast adaptation to new demonstrations, allowing for quick end-user personalization, (2) distills common knowledge across demonstrations, achieving accurate task inference; and (3) expands its model only when needed in lifelong deployments, maintaining a concise set of prototypical strategies that can approximate all behaviors via policy mixtures. We empirically validate that FLAIR achieves adaptability (i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency (i.e., the robot achieves sample-efficient adaptation), and scalability (i.e., the model grows sublinearly with the number of demonstrations while maintaining high performance). FLAIR surpasses benchmarks across three control tasks with an average 57% improvement in policy returns and an average 78% fewer episodes required for demonstration modeling using policy mixtures. Finally, we demonstrate the success of FLAIR in a table tennis task and find users rate FLAIR as having higher task (p<.05) and personalization (p<.05) performance.

翻译：从演示(LfD)中学习的方法使终端用户能够通过展示理想行为、使机器人进入民主化,教授机器人新任务。然而,目前的LfD框架无法快速适应各种人类演示或大规模部署无处不在的机器人应用,无法快速适应各种人类演示或大规模部署。在本文中,我们提议了一个全新的LfD框架,即快速终身适应反强化学习(FLAIR) 。我们的方法(1) 利用学习过的战略来构建政策混合物,以便快速适应新的演示,允许用户快速个人化,(2) 在整个演示中积累共同知识,实现准确的任务推导;(3) 只有在终身部署中需要时,才扩展其模型,以适应各种不同的人类演示或大规模部署。我们从经验上证实,FLIR实现了适应性(即机器人适应差异性、用户特有的任务偏好)、效率(即机器人实现样本效率适应)和可扩展性(例如,模型在演示中不断增长的次线与显示 < 任务数量,同时保持高性能回报。A最后的FAAA 成功性调整,作为我们平均任务中的平均比例的演示任务。

0

相关内容

FAST

FAST：Conference on File and Storage Technologies。 Explanation：文件和存储技术会议。 Publisher：USENIX。 SIT:http://dblp.uni-trier.de/db/conf/fast/

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

ApoE促进LRP1介导小胶质细胞内化降解Aβ的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

低强度脉冲超声促进BMSCs修复骨关节炎软骨的自噬调控机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

HOTAIR与DNMT1相互作用在皮肤细胞衰老中的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

青少年特发性脊柱侧凸发病机制中非编码RNA的相关研究

国家自然科学基金

0+阅读 · 2012年12月31日

钛种植体表面还原响应型双基因有序控释聚电解质超薄膜的构建

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

基于有序球腔微/纳结构的表面增强拉曼光谱研究

国家自然科学基金

0+阅读 · 2008年12月31日

Learning-Rate-Free Learning by D-Adaptation

Arxiv

0+阅读 · 2023年1月20日

Deep Reinforcement Learning for Power Trading

Arxiv

1+阅读 · 2023年1月19日

A Meta-Learning Approach for Software Refactoring

Arxiv

0+阅读 · 2023年1月19日

Multi-objective Software Architecture Refactoring driven by Quality Attributes

Arxiv

0+阅读 · 2023年1月18日

MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation

Arxiv

0+阅读 · 2023年1月18日

Adaptive Pattern Extraction Multi-Task Learning for Multi-Step Conversion Estimations

Arxiv

0+阅读 · 2023年1月18日

Genetic Imitation Learning by Reward Extrapolation

Arxiv

0+阅读 · 2023年1月3日

Adaptive Synthetic Characters for Military Training

Adaptive Synthetic Characters for Military Training

Arxiv

49+阅读 · 2021年1月6日

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Arxiv

20+阅读 · 2020年3月10日

Transfer Adaptation Learning: A Decade Survey

Transfer Adaptation Learning: A Decade Survey

Arxiv

37+阅读 · 2019年3月12日

VIP会员

文章信息

相关主题

逆强化学习

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Learning-Rate-Free Learning by D-Adaptation

Arxiv

0+阅读 · 2023年1月20日

Deep Reinforcement Learning for Power Trading

Arxiv

1+阅读 · 2023年1月19日

A Meta-Learning Approach for Software Refactoring

Arxiv

0+阅读 · 2023年1月19日

Multi-objective Software Architecture Refactoring driven by Quality Attributes

Arxiv

0+阅读 · 2023年1月18日

MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation

Arxiv

0+阅读 · 2023年1月18日

Adaptive Pattern Extraction Multi-Task Learning for Multi-Step Conversion Estimations

Arxiv

0+阅读 · 2023年1月18日

Genetic Imitation Learning by Reward Extrapolation

Arxiv

0+阅读 · 2023年1月3日

Adaptive Synthetic Characters for Military Training

Adaptive Synthetic Characters for Military Training

Arxiv

49+阅读 · 2021年1月6日

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Arxiv

20+阅读 · 2020年3月10日

Transfer Adaptation Learning: A Decade Survey

Transfer Adaptation Learning: A Decade Survey

Arxiv

37+阅读 · 2019年3月12日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

ApoE促进LRP1介导小胶质细胞内化降解Aβ的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

低强度脉冲超声促进BMSCs修复骨关节炎软骨的自噬调控机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

HOTAIR与DNMT1相互作用在皮肤细胞衰老中的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

青少年特发性脊柱侧凸发病机制中非编码RNA的相关研究

国家自然科学基金

0+阅读 · 2012年12月31日

钛种植体表面还原响应型双基因有序控释聚电解质超薄膜的构建

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

基于有序球腔微/纳结构的表面增强拉曼光谱研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员