3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding

This paper addresses the challenge of training a single network to jointly perform multiple dense prediction tasks, such as segmentation and depth estimation, i.e., multi-task learning (MTL). Current approaches mainly capture cross-task relations in the 2D image space, which often yields unstructured features that lack 3D-awareness. We argue that 3D-awareness is vital for modeling the cross-task correlations that comprehensive scene understanding depends on. We propose to address this problem by integrating correlations across views, i.e., a cost volume, into the MTL network as a form of geometric consistency. Specifically, we introduce a lightweight Cross-view Module (CvM), shared across tasks, that exchanges information across views and captures cross-view correlations; its output is combined with features from the MTL encoder for multi-task prediction. The module is architecture-agnostic and can be applied to both single-view and multi-view data. Extensive results on NYUv2 and PASCAL-Context demonstrate that our method effectively injects geometric consistency into existing MTL methods and improves their performance.
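To make the idea of a cross-view correlation concrete, the following is a minimal NumPy sketch of a generic all-pairs correlation volume between the feature maps of two views, followed by a simple fusion with encoder features. It is an illustration under stated assumptions, not the paper's CvM: the abstract does not specify how the cost volume is built or fused, so the function names `correlation_volume` and `fuse_with_encoder`, the scaled dot-product similarity, and the max/mean summary are all hypothetical choices.

```python
import numpy as np


def correlation_volume(feat_a, feat_b):
    """All-pairs correlation between two (C, H, W) feature maps.

    Returns an array of shape (H, W, H, W) whose entry [i, j, k, l] is the
    scaled dot product between feat_a[:, i, j] and feat_b[:, k, l].
    This is a generic construction, assumed here for illustration.
    """
    c, h, w = feat_a.shape
    assert feat_b.shape == (c, h, w), "both views must share one feature shape"
    a = feat_a.reshape(c, h * w)
    b = feat_b.reshape(c, h * w)
    corr = a.T @ b / np.sqrt(c)  # (H*W, H*W) similarity matrix
    return corr.reshape(h, w, h, w)


def fuse_with_encoder(encoder_feat, corr):
    """Append a per-pixel summary of the correlations to encoder features.

    Hypothetical fusion: for each pixel of view A, take the max and the mean
    correlation over all pixels of view B, then concatenate along channels.
    """
    h, w = corr.shape[:2]
    flat = corr.reshape(h, w, -1)
    summary = np.stack([flat.max(axis=-1), flat.mean(axis=-1)], axis=0)  # (2, H, W)
    return np.concatenate([encoder_feat, summary], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.standard_normal((8, 4, 5))  # features of view A: (C, H, W)
    view_b = rng.standard_normal((8, 4, 5))  # features of view B: (C, H, W)
    corr = correlation_volume(view_a, view_b)
    fused = fuse_with_encoder(view_a, corr)
    print(corr.shape, fused.shape)
```

In this sketch the fused tensor has two more channels than the encoder features, so any task-specific decoder that consumes it would need its input width adjusted accordingly.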


