TransVPR: Transformer-based place recognition with multi-level attention aggregation - 专知 (Zhuanzhi) Papers


January 6, 2022

TransVPR: Transformer-based place recognition with multi-level attention aggregation


Ruotong Wang, Yanqing Shen, Weiliang Zuo, Sanping Zhou, Nanning Zheng

Visual place recognition is a challenging task for applications such as autonomous driving navigation and mobile robot localization. Distracting elements present in complex scenes often lead to deviations in the perception of visual place. To address this problem, it is crucial to integrate information from only task-relevant regions into image representations. In this paper, we introduce a novel holistic place recognition model, TransVPR, based on vision Transformers. It benefits from a desirable property of the self-attention operation in Transformers, which can naturally aggregate task-relevant features. Attentions from multiple levels of the Transformer, which focus on different regions of interest, are further combined to generate a global image representation. In addition, the output tokens from Transformer layers, filtered by the fused attention mask, are treated as key-patch descriptors, which are used to perform spatial matching to re-rank the candidates retrieved by the global image features. The whole model allows end-to-end training with a single objective and image-level supervision. TransVPR achieves state-of-the-art performance on several real-world benchmarks while maintaining low computational time and storage requirements.
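The two-stage flow described in the abstract (retrieve candidates with a global descriptor, then re-rank them with key-patch descriptors) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the pooling, the mask fusion rule, or the spatial matching procedure, so every function name below is hypothetical, the fused mask is assumed to keep a patch when any level attends to it above a threshold, and a mutual nearest-neighbour count stands in for the spatial matching step.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(tokens, attn):
    """Weighted sum of patch tokens using one level's attention weights."""
    dim = len(tokens[0])
    return [sum(w * t[d] for w, t in zip(attn, tokens)) for d in range(dim)]

def global_descriptor(levels):
    """levels: list of (tokens, attn) pairs, one per Transformer level.
    Pools each level with its own attention and concatenates the results."""
    desc = []
    for tokens, attn in levels:
        desc.extend(l2_normalize(attention_pool(tokens, attn)))
    return l2_normalize(desc)

def fused_mask(levels, threshold):
    """Assumed fusion rule: keep a patch if any level attends to it above threshold."""
    n_patches = len(levels[0][1])
    return [max(attn[i] for _, attn in levels) > threshold for i in range(n_patches)]

def key_patches(tokens, mask):
    """Keep only the tokens selected by the fused mask."""
    return [l2_normalize(t) for t, keep in zip(tokens, mask) if keep]

def mutual_matches(a, b):
    """Count mutual nearest neighbours between two sets of patch descriptors.
    A stand-in for spatial matching, which the abstract does not detail."""
    if not a or not b:
        return 0
    best_ab = [max(range(len(b)), key=lambda j: dot(x, b[j])) for x in a]
    best_ba = [max(range(len(a)), key=lambda i: dot(y, a[i])) for y in b]
    return sum(1 for i, j in enumerate(best_ab) if best_ba[j] == i)

def retrieve_and_rerank(query, database, top_k):
    """query and database entries are dicts with 'global' and 'patches' keys.
    Stage 1 ranks by global similarity; stage 2 re-orders the top_k candidates."""
    ranked = sorted(range(len(database)),
                    key=lambda i: -dot(query["global"], database[i]["global"]))
    candidates = ranked[:top_k]
    return sorted(candidates,
                  key=lambda i: -mutual_matches(query["patches"], database[i]["patches"]))
```

Because Python's `sorted` is stable, candidates with the same match count keep their order from the global-descriptor stage, so re-ranking only overrides the first stage when the patch-level evidence differs.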

