Visual navigation for autonomous agents is a core task in the fields of computer vision and robotics. Learning-based methods, such as deep reinforcement learning, have the potential to outperform the classical solutions developed for this task; however, they come at a significantly higher computational cost. In this work, we design a novel approach that aims to perform comparably to or better than existing learning-based solutions, but under a clear time/computational budget. To this end, we propose a method to encode vital scene semantics -- such as traversable paths, unexplored areas, and observed scene objects -- alongside raw visual streams such as RGB, depth, and semantic segmentation masks, into a semantically informed, top-down egocentric map representation. Further, to enable the effective use of this information, we introduce a novel 2-D map attention mechanism based on the successful multi-layer Transformer networks. We conduct experiments on 3-D reconstructed indoor PointGoal visual navigation and demonstrate the effectiveness of our approach. We show that by using our novel attention schema and auxiliary rewards to better utilize scene semantics, we outperform multiple baselines trained with only raw inputs or implicit semantic information, while operating with an 80% decrease in the agent's experience.
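To make the idea of attending over a top-down egocentric map concrete, the following is a minimal illustrative sketch (not the authors' implementation): the map, a grid of per-cell semantic feature vectors, is flattened into tokens, and a query vector (e.g. an agent state embedding) attends over the cells via Transformer-style scaled dot-product attention. All names, shapes, and the single-query/single-head simplification are assumptions for illustration.

```python
import math

def map_attention(grid, query):
    """Hypothetical sketch of 2-D map attention.

    grid:  H x W nested lists, each cell a length-C feature vector
           (e.g. traversability, exploration, object channels).
    query: length-C vector, e.g. an agent state embedding.
    Returns a length-C context vector pooled over all map cells.
    """
    # Flatten the 2-D map into a sequence of H*W tokens.
    tokens = [cell for row in grid for cell in row]
    d = len(query)
    # Scaled dot-product attention scores, one per map cell.
    scores = [sum(q * k for q, k in zip(query, tok)) / math.sqrt(d)
              for tok in tokens]
    # Numerically stable softmax over the cells.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Attention-weighted sum of cell features: the map context vector.
    return [sum(w * tok[i] for w, tok in zip(weights, tokens))
            for i in range(d)]

# Tiny example: a 2x2 map with 2 semantic channels per cell.
grid = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 1.0], [0.0, 0.0]]]
ctx = map_attention(grid, [1.0, 0.0])
```

A full multi-layer Transformer would add learned query/key/value projections, multiple heads, and feed-forward sublayers; the sketch only shows the core attention pooling over map cells.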