DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation - Zhuanzhi (专知) Papers



March 6, 2023

DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation


Md Awsafur Rahman, Shaikh Anowarul Fattah

Depth estimation from a single image is a central problem in computer vision with many applications. Conventional methods face a trade-off between global consistency and fine-grained detail because of their local receptive fields, which limits their practicality. This lack of long-range dependency comes from the convolutional neural network part of the architecture. In this paper, a dual window transformer-based network, namely DwinFormer, is proposed, which uses both local and global features for end-to-end monocular depth estimation. DwinFormer consists of dual window self-attention and cross-attention transformers, Dwin-SAT and Dwin-CAT, respectively. Dwin-SAT extracts detailed, locally aware features while also capturing global context. It combines local and global window attention to capture both short-range and long-range dependencies, which removes the need for complex and computationally expensive operations such as attention masking or window shifting. Moreover, Dwin-SAT introduces inductive biases that provide desirable properties, such as translational equivariance and less dependence on large-scale data. Conventional decoding methods often rely on skip connections, which may cause semantic discrepancies and a lack of global context when fusing encoder and decoder features. In contrast, Dwin-CAT employs both local and global window cross-attention to fuse encoder and decoder features with both fine-grained local and contextually aware global information, thereby closing the semantic gap. Extensive experiments on the NYU-Depth-V2 and KITTI datasets show that the proposed method consistently outperforms existing approaches in both indoor and outdoor environments.
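The abstract gives no implementation details, so the following is only a minimal NumPy sketch of what "local plus global window attention" could look like, not the authors' method. It makes several assumptions: a local window is a contiguous `w x w` block, a global window is a `w x w` grid of tokens sampled with a stride across the whole feature map, the two stages run one after the other, learned query/key/value projections are omitted, and in the cross-attention case the decoder supplies the queries while the encoder supplies keys and values. All function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_windows(x, w):
    """(H, W, C) -> (num_windows, w*w, C); each window is a contiguous w x w block."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, C)

def merge_local(win, H, W, w):
    """Inverse of local_windows."""
    C = win.shape[-1]
    x = win.reshape(H // w, W // w, w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def global_windows(x, w):
    """(H, W, C) -> (num_windows, w*w, C); each window holds w x w tokens
    sampled with stride (H // w, W // w), so it spans the whole map."""
    H, W, C = x.shape
    x = x.reshape(w, H // w, w, W // w, C).transpose(1, 3, 0, 2, 4)
    return x.reshape(-1, w * w, C)

def merge_global(win, H, W, w):
    """Inverse of global_windows."""
    C = win.shape[-1]
    x = win.reshape(H // w, W // w, w, w, C).transpose(2, 0, 3, 1, 4)
    return x.reshape(H, W, C)

def window_attention(q_win, kv_win):
    """Scaled dot-product attention inside each window (identity projections)."""
    d = q_win.shape[-1]
    scores = q_win @ kv_win.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores) @ kv_win

def dual_window_attention(query, w, context=None):
    """Local window attention followed by global window attention.

    context=None gives self-attention (each stage attends to its own input);
    otherwise keys and values come from `context` (cross-attention).
    H and W must be divisible by w.
    """
    H, W, _ = query.shape
    ctx = query if context is None else context
    local = merge_local(
        window_attention(local_windows(query, w), local_windows(ctx, w)), H, W, w)
    ctx = local if context is None else context
    return merge_global(
        window_attention(global_windows(local, w), global_windows(ctx, w)), H, W, w)

rng = np.random.default_rng(0)
encoder_feat = rng.normal(size=(8, 8, 16))  # stand-in encoder feature map
decoder_feat = rng.normal(size=(8, 8, 16))  # stand-in decoder feature map

self_out = dual_window_attention(encoder_feat, w=4)                     # self-attention
fused = dual_window_attention(decoder_feat, w=4, context=encoder_feat)  # cross-attention
print(self_out.shape, fused.shape)
```

Because both partitions are plain reshapes and transposes, every token reaches distant positions through the strided windows without any attention mask or window shift, which is the property the abstract highlights. A real model would add learned projections, multiple heads, normalization, and feed-forward layers.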

