Display technologies have evolved over the years. Practical high dynamic range (HDR) capture, processing, and display solutions are critical to bringing 3D technologies to the next level. Depth estimation from multi-exposure stereo image sequences is an essential task in the development of cost-effective 3D HDR video content. In this paper, we develop a novel deep architecture for multi-exposure stereo depth estimation. The proposed architecture has two novel components. First, the stereo matching technique used in traditional stereo depth estimation is revamped: for the stereo depth estimation component of our architecture, we deploy a mono-to-stereo transfer learning approach. The proposed formulation circumvents the need to construct a cost volume, which is replaced by a ResNet-based dual-encoder, single-decoder CNN with different weights for feature fusion. EfficientNet-based blocks are used to learn the disparity. Second, we combine the disparity maps obtained from the stereo images at different exposure levels using a robust disparity feature fusion approach. The disparity maps obtained at different exposures are merged using weight maps calculated for different quality measures. The final predicted disparity map is more robust and retains the best features, preserving the depth discontinuities. The proposed CNN can be trained on either standard dynamic range stereo data or multi-exposure low dynamic range stereo sequences. In terms of performance, the proposed model surpasses state-of-the-art monocular and stereo depth estimation methods, both quantitatively and qualitatively, on the challenging Scene Flow dataset and on differently exposed Middlebury stereo datasets. The architecture performs exceedingly well on complex natural scenes, demonstrating its usefulness for diverse 3D HDR applications.
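The disparity fusion step described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not name the quality measures, so the `well_exposedness` measure below (a Gaussian around mid-gray, borrowed from classical exposure fusion) is an assumption, and the function names, `sigma`, and `eps` are hypothetical. The sketch only shows the general pattern of normalizing per-pixel weight maps across exposures and taking a weighted sum of the per-exposure disparity maps.

```python
import numpy as np


def well_exposedness(image, sigma=0.2):
    """Hypothetical quality measure (an assumption, not from the paper).

    Scores each pixel of a grayscale image in [0, 1] by how close its
    intensity is to mid-gray (0.5), as in classical exposure fusion.
    """
    return np.exp(-((image - 0.5) ** 2) / (2.0 * sigma ** 2))


def fuse_disparities(disparities, weights, eps=1e-8):
    """Merge per-exposure disparity maps with per-pixel weight maps.

    disparities: list of (H, W) arrays, one per exposure level.
    weights:     list of (H, W) non-negative arrays, one per exposure level.
    Returns an (H, W) array: the per-pixel weighted average of the inputs.
    """
    d = np.stack(disparities, axis=0).astype(np.float64)
    w = np.stack(weights, axis=0).astype(np.float64)
    # Normalize so the weights at each pixel sum to (almost exactly) 1.
    # eps avoids division by zero where every weight is 0; such pixels
    # receive a fused disparity of 0.
    w = w / (w.sum(axis=0, keepdims=True) + eps)
    return (w * d).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for an under-exposed and an over-exposed view, plus the
    # disparity map predicted from each exposure (random placeholder data).
    under = rng.uniform(0.0, 0.4, size=(4, 4))
    over = rng.uniform(0.6, 1.0, size=(4, 4))
    disp_under = rng.uniform(0.0, 64.0, size=(4, 4))
    disp_over = rng.uniform(0.0, 64.0, size=(4, 4))

    fused = fuse_disparities(
        [disp_under, disp_over],
        [well_exposedness(under), well_exposedness(over)],
    )
    print(fused.shape)
```

Several quality measures could be combined by multiplying their weight maps per exposure before passing them to `fuse_disparities`; whether the paper combines its measures this way is not stated in the abstract.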