For a monocular 360 image, depth estimation is challenging because the distortion increases with latitude. To perceive the distortion, existing methods resort to designing deep and complex network architectures. In this paper, we provide a new perspective that constructs an interpretable and sparse representation for a 360 image. Considering the importance of the geometric structure in depth estimation, we utilize the contourlet transform to capture an explicit geometric cue in the spectral domain and integrate it with an implicit cue in the spatial domain. Specifically, we propose a neural contourlet network consisting of a convolutional neural network and a contourlet transform branch. In the encoder stage, we design a spatial-spectral fusion module to effectively fuse the two types of cues. In the decoder, conversely, we employ the inverse contourlet transform with learned low-pass subbands and band-pass directional subbands to compose the depth. Experiments on three popular panoramic image datasets demonstrate that the proposed approach outperforms state-of-the-art schemes with faster convergence. Code is available at https://github.com/zhijieshen-bjtu/Neural-Contourlet-Network-for-MODE.
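The latitude-dependent distortion mentioned above can be made concrete with a small sketch. It assumes the 360 image is stored in the equirectangular projection (the abstract does not name the format), in which every row covers the full 360 degrees of longitude, so a row at latitude φ is stretched horizontally by a factor of 1/cos(φ). The function names `row_latitudes` and `horizontal_stretch` are hypothetical and are not taken from the authors' code.

```python
import math


def row_latitudes(height):
    """Latitude in radians at the center of each image row.

    Assumes an equirectangular layout: row 0 is nearest the north pole,
    the last row is nearest the south pole.
    """
    return [(0.5 - (i + 0.5) / height) * math.pi for i in range(height)]


def horizontal_stretch(height):
    """Horizontal stretch factor 1 / cos(latitude) for each row.

    Row centers never sit exactly on a pole, so the cosine is nonzero.
    """
    return [1.0 / math.cos(lat) for lat in row_latitudes(height)]


if __name__ == "__main__":
    # Print the stretch for a toy 8-row image.
    for lat, s in zip(row_latitudes(8), horizontal_stretch(8)):
        print(f"{math.degrees(lat):7.2f} deg  stretch x{s:.2f}")
```

Rows near the equator have a factor close to 1, and the factor grows without bound toward the poles. This is why a plain convolution sees the same object at different apparent widths depending on its latitude.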