Approaches for teaching learning agents via human demonstrations have been widely studied and successfully applied to multiple domains. However, the majority of imitation learning work utilizes only behavioral information from the demonstrator, i.e., which actions were taken, and ignores other useful information. In particular, eye gaze information can give valuable insight into where the demonstrator is allocating visual attention, and it holds the potential to improve agent performance and generalization. In this work, we propose Gaze Regularized Imitation Learning (GRIL), a novel context-aware imitation learning architecture that learns concurrently from both human demonstrations and eye gaze to solve tasks where visual attention provides important context. We apply GRIL to a visual navigation task in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a photorealistic simulated environment. We show that GRIL outperforms several state-of-the-art gaze-based imitation learning algorithms, simultaneously learns to predict human visual attention, and generalizes to scenarios not present in the training data. Supplemental videos are available at https://sites.google.com/view/gaze-regularized-il/ and code at https://github.com/ravikt/gril.
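The core idea of learning concurrently from demonstrations and gaze can be illustrated with a minimal sketch: a shared visual encoder feeds a behavior-cloning head and an auxiliary gaze-prediction head, and the gaze term regularizes the shared features. The network shapes, layer sizes, and the loss weight `lam` below are illustrative assumptions, not the paper's actual architecture; see the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch of gaze-regularized imitation learning (illustrative only;
# the actual GRIL architecture lives at https://github.com/ravikt/gril).
# A shared encoder feeds two heads: one predicts the demonstrator's action,
# the other predicts a coarse gaze heatmap over the input image.

class GazeRegularizedPolicy(nn.Module):
    def __init__(self, action_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(          # shared convolutional encoder
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
        )
        self.action_head = nn.Sequential(      # behavior-cloning head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, action_dim)
        )
        self.gaze_head = nn.Conv2d(64, 1, 1)   # gaze-heatmap head

    def forward(self, obs):
        z = self.encoder(obs)
        return self.action_head(z), self.gaze_head(z)

def gaze_regularized_loss(model, obs, demo_actions, gaze_maps, lam=0.1):
    """Joint loss: imitate demonstrated actions while predicting human gaze.

    `lam` weighs the gaze-regularization term (hypothetical value)."""
    pred_actions, pred_gaze = model(obs)
    bc_loss = nn.functional.mse_loss(pred_actions, demo_actions)
    # Downsample recorded gaze heatmaps to the predicted resolution.
    target = nn.functional.interpolate(gaze_maps, size=pred_gaze.shape[-2:])
    gaze_loss = nn.functional.mse_loss(pred_gaze, target)
    return bc_loss + lam * gaze_loss
```

In this formulation the gaze head never drives the quadrotor at test time; it serves purely as an auxiliary objective that biases the shared encoder toward the image regions the human attended to, which is one plausible reading of how gaze supervision can improve generalization.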