MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts - 专知 (Zhuanzhi) Papers


March 5, 2023

MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts


Xiatao Sun, Shuo Yang, Rahul Mangharam

Imitation learning has been widely applied to autonomous systems thanks to recent developments in interactive algorithms that address the covariate shift and compounding errors induced by traditional approaches such as behavior cloning. However, existing interactive imitation learning methods assume access to a single perfect expert, whereas in reality it is more likely that multiple imperfect experts are available. In this paper, we propose MEGA-DAgger, a new DAgger variant suited to interactive learning with multiple imperfect experts. First, unsafe demonstrations are filtered out while the training data is aggregated, so that imperfect demonstrations have little influence on the training of the novice policy. Next, experts are evaluated and compared on scenario-specific metrics to resolve conflicting labels among experts. Through experiments in autonomous racing scenarios, we demonstrate that a policy learned using MEGA-DAgger can outperform both the experts and policies learned using the state-of-the-art interactive imitation learning algorithm. The supplementary video can be found at https://youtu.be/pYQiPSHk6dU.
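The two steps named in the abstract (filtering unsafe demonstrations during aggregation, then resolving conflicting expert labels with a metric) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the safety test or the metrics, so `Expert`, `aggregate_labels`, `is_safe`, `score`, and the one-dimensional toy track below are all hypothetical stand-ins chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = float   # toy state: lateral offset from the track centerline (assumption)
Action = float  # toy action: change applied to that offset (assumption)


@dataclass
class Expert:
    """A hypothetical imperfect expert: a name plus a labeling policy."""
    name: str
    policy: Callable[[State], Action]


def aggregate_labels(
    states: Sequence[State],
    experts: Sequence[Expert],
    is_safe: Callable[[State, Action], bool],
    score: Callable[[State, Action], float],
    dataset: Optional[List[Tuple[State, Action]]] = None,
) -> List[Tuple[State, Action]]:
    """Add one (state, action) label per visited state to the dataset.

    Step 1 (filter): discard expert labels that the safety test rejects.
    Step 2 (resolve): among the remaining labels, keep the highest-scoring one.
    States for which every expert label is unsafe contribute nothing.
    """
    data = list(dataset) if dataset else []
    for s in states:
        labels = [e.policy(s) for e in experts]
        safe = [a for a in labels if is_safe(s, a)]
        if not safe:
            continue  # no trustworthy label for this state
        data.append((s, max(safe, key=lambda a: score(s, a))))
    return data


# Hypothetical toy problem: the track spans offsets in [-1, 1].
def is_safe(s: State, a: Action) -> bool:
    return abs(s + a) <= 1.0  # the next offset must stay on the track


def score(s: State, a: Action) -> float:
    return -abs(s + a)  # prefer actions that end nearer the centerline


experts = [
    Expert("aggressive", lambda s: -2.0 * s),  # overshoots the centerline
    Expert("timid", lambda s: -0.5 * s),       # corrects only halfway
    Expert("biased", lambda s: 1.0),           # always steers the same way
]

if __name__ == "__main__":
    print(aggregate_labels([0.8, -0.4, 5.0], experts, is_safe, score))
```

In a DAgger-style loop, the `states` argument would be the states visited by the current novice policy, and the returned dataset would be used to retrain the novice before the next rollout. The sketch covers only the labeling step.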

