Transformer-based Network for RGB-D Saliency Detection


2021 年 12 月 1 日


Yue Wang, Xu Jia, Lu Zhang, Yuke Li, James Elder, Huchuan Lu

RGB-D saliency detection integrates information from both RGB images and depth maps to improve prediction of salient regions under challenging conditions. The key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities. Previous approaches tend to apply multi-scale and multi-modal fusion separately via local operations, an approach that fails to capture long-range dependencies. Here we propose a transformer-based network to address this issue. Our proposed architecture is composed of two modules: a transformer-based within-modality feature enhancement module (TWFEM) and a transformer-based feature fusion module (TFFM). TFFM fuses features by integrating them from multiple scales and both modalities over all positions simultaneously. Before TFFM, TWFEM enhances the features at each scale by selecting and integrating complementary information from other scales within the same modality. We show that the transformer is a uniform operation that is effective for both feature fusion and feature enhancement, and that it simplifies the model design. Extensive experimental results on six benchmark datasets demonstrate that our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
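The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes single-head attention, one shared set of projection weights, no positional encoding, and no layer normalization, and every name in it (`to_tokens`, `attention`, `enhance_within_modality`, `fuse`) is hypothetical. It only shows how features from several scales and two modalities can be flattened into tokens so that attention spans all positions at once.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def to_tokens(feature_map):
    # Flatten a (C, H, W) feature map into (H*W, C) tokens.
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T

def attention(queries, context, w_q, w_k, w_v):
    # Single-head scaled dot-product attention (a simplifying assumption).
    q, k, v = queries @ w_q, context @ w_k, context @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights

def enhance_within_modality(scale_tokens, w_q, w_k, w_v):
    # TWFEM-like step: each scale attends to the other scales
    # of the same modality and adds the result as a residual.
    enhanced = []
    for i, tokens in enumerate(scale_tokens):
        others = np.concatenate(
            [t for j, t in enumerate(scale_tokens) if j != i], axis=0)
        update, _ = attention(tokens, others, w_q, w_k, w_v)
        enhanced.append(tokens + update)
    return enhanced

def fuse(rgb_tokens, depth_tokens, w_q, w_k, w_v):
    # TFFM-like step: joint self-attention over every position
    # of every scale in both modalities.
    tokens = np.concatenate(rgb_tokens + depth_tokens, axis=0)
    update, weights = attention(tokens, tokens, w_q, w_k, w_v)
    return tokens + update, weights

rng = np.random.default_rng(0)
C = 8                               # channel width (illustrative)
sizes = [(8, 8), (4, 4), (2, 2)]    # three scales (illustrative)
rgb = [to_tokens(rng.standard_normal((C, h, w))) for h, w in sizes]
depth = [to_tokens(rng.standard_normal((C, h, w))) for h, w in sizes]
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

rgb = enhance_within_modality(rgb, w_q, w_k, w_v)
depth = enhance_within_modality(depth, w_q, w_k, w_v)
fused, weights = fuse(rgb, depth, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` covers all 168 tokens (84 per modality), which is the sense in which attention captures long-range dependencies that local operations miss. A real model would likely use multi-head attention with separate weights per module.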


