Multi-Task Learning with Multi-query Transformer for Dense Prediction

Previous multi-task dense prediction studies developed complex pipelines, such as multi-modal distillation in multiple stages or searching for task-relational contexts for each task. The core insight behind these methods is to maximize the mutual effects among tasks. Inspired by recent query-based Transformers, we propose a simpler pipeline named Multi-Query Transformer (MQTransformer), which is equipped with multiple queries from different tasks to facilitate reasoning among multiple tasks and to simplify the cross-task pipeline. Instead of modeling the dense per-pixel context among different tasks, we seek a task-specific proxy to perform cross-task reasoning via multiple queries, where each query encodes the task-related context. The MQTransformer is composed of three key components: a shared encoder, cross-task attention, and a shared decoder. First, we model each task with a task-relevant and scale-aware query; both the image feature output by the feature extractor and the task-relevant query feature are then fed into the shared encoder, thus encoding the query feature from the image feature. Second, we design a cross-task attention module to reason about the dependencies among multiple tasks and feature scales from two perspectives: different tasks at the same scale and different scales of the same task. Third, we use a shared decoder to gradually refine the image features with the reasoned query features from different tasks. Extensive experimental results on two dense prediction datasets (NYUD-v2 and PASCAL-Context) show that the proposed method is effective and achieves state-of-the-art results. Code will be available.
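The three-stage flow described in the abstract (shared encoder, cross-task attention, shared decoder) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single feature scale, plain scaled dot-product attention with no learned projections, and residual additions. The function names (`attend`, `mq_forward`), the task names, and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention on lists of vectors: softmax(QK^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def add(xs, ys):
    """Element-wise residual addition of two equally shaped lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def mq_forward(image_feats, task_queries):
    """Hypothetical single-scale forward pass; returns refined features per task."""
    # 1) Shared encoder (assumed form): each task's queries read from the image features.
    encoded = {t: add(q, attend(q, image_feats, image_feats))
               for t, q in task_queries.items()}
    # 2) Cross-task attention (assumed form): every query attends over the
    #    concatenation of all tasks' queries.
    all_q = [vec for q in encoded.values() for vec in q]
    reasoned = {t: add(q, attend(q, all_q, all_q)) for t, q in encoded.items()}
    # 3) Shared decoder (assumed form): image features are refined by attending
    #    to each task's reasoned queries.
    return {t: add(image_feats, attend(image_feats, q, q))
            for t, q in reasoned.items()}


# Illustrative sizes and task names only.
random.seed(0)
dim, n_pixels, n_queries = 8, 16, 4
image_feats = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_pixels)]
task_queries = {
    t: [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]
    for t in ("semseg", "depth")
}
outputs = mq_forward(image_feats, task_queries)
```

In this sketch the per-pixel features never attend to each other across tasks. All cross-task interaction passes through the small set of query vectors, which is the "task-specific proxy" idea the abstract describes.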
