Recent guided depth super-resolution methods are premised on strict spatial alignment between depth and RGB, which enables high-quality depth reconstruction. However, in real-world scenarios, acquiring strictly aligned RGB-D data is hindered by inherent hardware limitations (e.g., physically separate RGB and depth sensors) and by unavoidable calibration drift induced by mechanical vibration or temperature variation. Consequently, existing approaches suffer inevitable performance degradation when applied to misaligned real-world scenes. In this paper, we propose the Multi-Order Matching Network (MOMNet), a novel alignment-free framework that adaptively retrieves and selects the most relevant information from misaligned RGB. Specifically, our method begins with a multi-order matching mechanism, which jointly performs zero-order, first-order, and second-order matching to comprehensively identify RGB information consistent with depth across multi-order feature spaces. To effectively integrate the retrieved RGB and depth, we further introduce a multi-order aggregation strategy composed of multiple structure detectors. This strategy uses multi-order priors as prompts to facilitate selective feature transfer from RGB to depth. Extensive experiments demonstrate that MOMNet achieves state-of-the-art performance and exhibits outstanding robustness to misalignment.
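To give an intuition for the multi-order matching idea, the following is a minimal, hypothetical sketch (not the authors' implementation): it interprets zero-, first-, and second-order representations as raw feature values, finite-difference gradients, and a discrete Laplacian, respectively, and scores an RGB patch against a depth patch by averaging cosine similarity across the three orders. All function names and the equal-weight aggregation are illustrative assumptions.

```python
import numpy as np

def multi_order_features(x):
    """Zero-, first-, and second-order views of a 2-D feature map (illustrative).
    Zero order: raw values; first order: gradient magnitude via finite
    differences; second order: a discrete Laplacian."""
    gy, gx = np.gradient(x)                                  # first-order derivatives
    lap = (np.roll(x, 1, 0) + np.roll(x, -1, 0)
           + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)   # second-order operator
    return x, np.hypot(gx, gy), lap

def match_score(depth_patch, rgb_patch):
    """Average cosine similarity over the three orders (equal weights assumed)."""
    score = 0.0
    for d, r in zip(multi_order_features(depth_patch),
                    multi_order_features(rgb_patch)):
        d, r = d.ravel(), r.ravel()
        denom = np.linalg.norm(d) * np.linalg.norm(r) + 1e-8
        score += float(d @ r) / denom
    return score / 3.0

# A patch should match an aligned copy of itself better than a shifted
# (misaligned) copy, which is the property the matching step relies on.
rng = np.random.default_rng(0)
depth = rng.standard_normal((8, 8))
aligned = depth.copy()
shifted = np.roll(depth, 2, axis=1)
assert match_score(depth, aligned) > match_score(depth, shifted)
```

In the actual network the candidates would be learned feature maps and the retrieval would search over spatial offsets; this toy version only shows why combining several differential orders gives a stricter consistency test than raw intensities alone.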