Inferring geometrically consistent dense 3D scenes across a tuple of temporally consecutive images remains challenging for self-supervised monocular depth prediction pipelines. This paper explores how the increasingly popular transformer architecture, together with novel regularized loss formulations, can improve depth consistency while preserving accuracy. We propose a spatial attention module that correlates coarse depth predictions to aggregate local geometric information. A novel temporal attention mechanism further processes this local geometric information in a global context across consecutive images. Additionally, we introduce geometric constraints between frames, regularized by photometric cycle consistency. By combining our proposed regularization with the novel spatio-temporal attention module, we fully leverage both the geometric and appearance-based consistency across monocular frames. This yields geometrically meaningful attention and improves temporal depth stability and accuracy compared to previous methods.
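The two-stage attention described above can be illustrated with a minimal sketch: scaled dot-product attention applied first over spatial tokens within each frame, then over the time axis across frames. This is a hypothetical simplification for intuition only (the function names, token layout `(T, N, d)`, and self-attention form are assumptions, not the paper's actual implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention over token sets of shape (n, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def spatio_temporal_attention(feats):
    # feats: (T, N, d) — T consecutive frames, N spatial tokens
    # (e.g. patches of a coarse depth prediction), d feature channels.
    # 1) spatial self-attention within each frame aggregates local geometry
    spatial = np.stack([attention(f, f, f) for f in feats])      # (T, N, d)
    # 2) temporal self-attention across frames, per spatial token,
    #    places the local information in a global temporal context
    tokens = spatial.transpose(1, 0, 2)                          # (N, T, d)
    temporal = np.stack([attention(t, t, t) for t in tokens])    # (N, T, d)
    return temporal.transpose(1, 0, 2)                           # (T, N, d)
```

In a real pipeline the queries, keys, and values would come from learned projections and multiple heads; this sketch only shows how spatial and temporal aggregation compose.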