从示范活动中学习 (Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations) - 专知论文

会员服务 ·

0

FAST · 逆强化学习 · Learning · 机器人 · MoDELS ·

2022 年 9 月 24 日

Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

翻译：从示范活动中学习

Letian Chen,Sravan Jayanthi,Rohan Paleja,Daniel Martin,Viacheslav Zakharov,Matthew Gombolay

Learning from Demonstration (LfD) approaches empower end-users to teach robots novel tasks via demonstrations of the desired behaviors, democratizing access to robotics. However, current LfD frameworks are not capable of fast adaptation to heterogeneous human demonstrations nor the large-scale deployment in ubiquitous robotics applications. In this paper, we propose a novel LfD framework, Fast Lifelong Adaptive Inverse Reinforcement learning (FLAIR). Our approach (1) leverages learned strategies to construct policy mixtures for fast adaptation to new demonstrations, allowing for quick end-user personalization; (2) distills common knowledge across demonstrations, achieving accurate task inference; and (3) expands its model only when needed in lifelong deployments, maintaining a concise set of prototypical strategies that can approximate all behaviors via policy mixtures. We empirically validate that FLAIR achieves adaptability (i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency (i.e., the robot achieves sample-efficient adaptation), and scalability (i.e., the model grows sublinearly with the number of demonstrations while maintaining high performance). FLAIR surpasses benchmarks across three continuous control tasks with an average 57% improvement in policy returns and an average 78% fewer episodes required for demonstration modeling using policy mixtures. Finally, we demonstrate the success of FLAIR in a real-robot table tennis task.

翻译：从演示(LfD)中学习的方法使终端用户能够通过展示理想行为、实现机器人进入的民主化,向机器人传授新任务。然而,目前的LfD框架无法快速适应各种人类演示或大规模部署无处不在的机器人应用,无法快速适应各种人类演示或大规模部署。在本文中,我们提议了一个全新的LfD框架,即快速寿命适应性反强化学习(FLAIR) 。我们的方法(1) 利用学习后的战略来构建政策混合物,以便快速适应新的演示,允许用户快速个人化;(2) 在整个演示中积累共同知识,实现准确的任务推导;(3) 只有在终身部署中需要时,才扩展其模式,维持一套简明的原型战略,通过政策混合物可以接近所有的行为。我们从经验上证实,FLAIR实现了适应性(即机器人适应差异性、用户特有的任务偏好)、效率(即机器人实现样本高效适应性适应性适应)以及可扩展性(即我们在57号网上用户个人化的适应性适应性)和可扩展性(即模型与演示的次线次模型增长,在57号模型中增加演示数量,同时保持高性A 平均政策回报,在FAA 上,在超过要求的平均回报上,在FA 上,在超过一个平均政策上,在超过一个标准上,在FAA 上,在超过一个标准上显示一个标准上,在超过一个标准。

0

相关内容

FAST

FAST：Conference on File and Storage Technologies。 Explanation：文件和存储技术会议。 Publisher：USENIX。 SIT:http://dblp.uni-trier.de/db/conf/fast/

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

最新《联邦学习Federated Learning》报告，Federated Learning

最新《联邦学习Federated Learning》报告，Federated Learning

专知会员服务

89+阅读 · 2020年12月2日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

面向跨领域异构数据的患者相似性学习方法及应用

国家自然科学基金

23+阅读 · 2016年12月31日

滋补肝肾法调控黑素细胞氧化损伤致凋亡线粒体通路的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Beclin 1-VPS34复合体对神经细胞内β淀粉样蛋白稳态的调控作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

早年应激与nectin-afadin系统调控海马环路发育与可塑性的分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

HEV感染致脑组织损伤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Beclin 1在阿尔茨海默病样神经元损伤中的调控机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于线粒体蛋白质组学的针刺治疗阿尔茨海默病的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

15-kDa硒蛋白在内质网应激（ERS）和阿尔茨海默病(AD)中的功能研究

国家自然科学基金

0+阅读 · 2012年12月31日

原位诱导反应性星型胶质细胞增殖分化促进脊髓神经功能恢复的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于流行学研究方法的阿尔茨海默病（AD）中医证候研究

国家自然科学基金

0+阅读 · 2009年12月31日

Adaptive Selection of the Optimal Strategy to Improve Precision and Power in Randomized Trials

Arxiv

0+阅读 · 2022年10月31日

L-GreCo: An Efficient and General Framework for Layerwise-Adaptive Gradient Compression

Arxiv

0+阅读 · 2022年10月31日

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

Arxiv

0+阅读 · 2022年10月30日

Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms

Arxiv

0+阅读 · 2022年10月29日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Multi-Domain Multi-Task Rehearsal for Lifelong Learning

Multi-Domain Multi-Task Rehearsal for Lifelong Learning

Arxiv

12+阅读 · 2020年12月14日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

79+阅读 · 2020年1月19日

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Arxiv

34+阅读 · 2019年10月24日

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking

Arxiv

11+阅读 · 2018年3月23日

VIP会员

文章信息

相关主题

逆强化学习

相关VIP内容

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

最新《联邦学习Federated Learning》报告，Federated Learning

最新《联邦学习Federated Learning》报告，Federated Learning

专知会员服务

89+阅读 · 2020年12月2日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

相关论文

Adaptive Selection of the Optimal Strategy to Improve Precision and Power in Randomized Trials

Arxiv

0+阅读 · 2022年10月31日

L-GreCo: An Efficient and General Framework for Layerwise-Adaptive Gradient Compression

Arxiv

0+阅读 · 2022年10月31日

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

Arxiv

0+阅读 · 2022年10月30日

Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms

Arxiv

0+阅读 · 2022年10月29日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Multi-Domain Multi-Task Rehearsal for Lifelong Learning

Multi-Domain Multi-Task Rehearsal for Lifelong Learning

Arxiv

12+阅读 · 2020年12月14日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

79+阅读 · 2020年1月19日

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Arxiv

34+阅读 · 2019年10月24日

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking

Arxiv

11+阅读 · 2018年3月23日

相关基金

面向跨领域异构数据的患者相似性学习方法及应用

国家自然科学基金

23+阅读 · 2016年12月31日

滋补肝肾法调控黑素细胞氧化损伤致凋亡线粒体通路的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Beclin 1-VPS34复合体对神经细胞内β淀粉样蛋白稳态的调控作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

早年应激与nectin-afadin系统调控海马环路发育与可塑性的分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

HEV感染致脑组织损伤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Beclin 1在阿尔茨海默病样神经元损伤中的调控机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于线粒体蛋白质组学的针刺治疗阿尔茨海默病的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

15-kDa硒蛋白在内质网应激（ERS）和阿尔茨海默病(AD)中的功能研究

国家自然科学基金

0+阅读 · 2012年12月31日

原位诱导反应性星型胶质细胞增殖分化促进脊髓神经功能恢复的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于流行学研究方法的阿尔茨海默病（AD）中医证候研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员