Reinforcement Learning-based Visual Navigation with Information-Theoretic Regularization - Zhuanzhi Papers


January 10, 2022

Reinforcement Learning-based Visual Navigation with Information-Theoretic Regularization


Qiaoyun Wu, Kai Xu, Jun Wang, Mingliang Xu, Xiaoxi Gong, Dinesh Manocha

From arXiv, 11 pages. Corresponding authors: Kai Xu (kevin.kai.xu@gmail.com) and Jun Wang (wjun@nuaa.edu.cn)

To enhance the cross-target and cross-scene generalization of target-driven visual navigation based on deep reinforcement learning (RL), we introduce an information-theoretic regularization term into the RL objective. The regularization maximizes the mutual information between navigation actions and visual observation transforms of an agent, thus promoting more informed navigation decisions. In this way, the agent models the action-observation dynamics by learning a variational generative model. Based on the model, the agent generates (imagines) the next observation from its current observation and navigation target. The agent thereby learns the causality between navigation actions and the changes in its observations, which allows it to predict the next navigation action by comparing the current and the imagined next observations. Cross-target and cross-scene evaluations on the AI2-THOR framework show that our method attains at least a $10\%$ improvement in average success rate over some state-of-the-art models. We further evaluate our model in two real-world settings: navigation in unseen indoor scenes from the discrete Active Vision Dataset (AVD) and in continuous real-world environments with a TurtleBot. We demonstrate that our navigation model successfully completes navigation tasks in these scenarios. Videos and models can be found in the supplementary material.
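The mutual-information term mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes three discrete actions, a hypothetical `slip` noise parameter, and an observation change summarized as a single discrete variable `d`. It computes the exact mutual information I(A; D) and the standard variational (Barber-Agakov) lower bound H(A) + E[log q(a | d)], which is tight when q is the true posterior and loose for a poor inverse model.

```python
import numpy as np


def mutual_information(joint):
    """Exact I(A; D) in nats from a joint probability table joint[a, d]."""
    pa = joint.sum(axis=1, keepdims=True)  # marginal p(a), shape (n, 1)
    pd = joint.sum(axis=0, keepdims=True)  # marginal p(d), shape (1, n)
    mask = joint > 0                       # skip zero cells to avoid log(0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pd)[mask])))


def variational_lower_bound(joint, q):
    """H(A) + E_p[log q(a | d)], where q[a, d] approximates p(a | d)."""
    pa = joint.sum(axis=1)
    entropy_a = -np.sum(pa[pa > 0] * np.log(pa[pa > 0]))
    mask = joint > 0
    return float(entropy_a + np.sum(joint[mask] * np.log(q[mask])))


def slip_joint(slip):
    """Hypothetical toy world: 3 uniform actions; the observed change d
    equals the action unless the agent slips to a uniformly random outcome."""
    n = 3
    p_d_given_a = (1 - slip) * np.eye(n) + slip * np.full((n, n), 1 / n)
    return p_d_given_a / n  # multiply by p(a) = 1/n


if __name__ == "__main__":
    joint = slip_joint(0.2)
    true_posterior = joint / joint.sum(axis=0, keepdims=True)  # p(a | d)
    uniform_guess = np.full((3, 3), 1 / 3)                     # uninformed q
    print("exact MI          :", mutual_information(joint))
    print("bound, true q     :", variational_lower_bound(joint, true_posterior))
    print("bound, uniform q  :", variational_lower_bound(joint, uniform_guess))
```

In this toy setting, improving the inverse model q raises the bound toward the exact mutual information, which is the general reason a variational model can stand in for an intractable mutual-information term.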


