We define CompositeTasking as the fusion of multiple, spatially distributed tasks covering various aspects of image understanding. Learning to perform spatially distributed tasks is motivated by two factors: labels are frequently available only sparsely across tasks, and a compact multi-tasking network is desirable. To facilitate CompositeTasking, we introduce a novel task conditioning model: a single encoder-decoder network that performs multiple, spatially varying tasks at once. The proposed network takes an image and a set of pixel-wise dense task requests as inputs, and performs the requested prediction task for each pixel. We also learn the composition of tasks to be performed according to some CompositeTasking rules, which includes deciding where to apply which task. This design not only yields a compact network for multi-tasking, but also allows for task editing. A further strength of the proposed method is that it requires only sparse supervision per task. The obtained results are on par with our baselines that use dense supervision and a multi-headed multi-tasking design. The source code will be made publicly available at www.github.com/nikola3794/composite-tasking.
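To make the input/output interface concrete, the following is a minimal toy sketch, not the proposed network. It assumes an image stored as a nested list of floats, a task request map holding one integer task id per pixel, and sparse labels in which `None` marks an unlabelled pixel. The names `TOY_TASKS`, `composite_predict`, and `sparse_l1_loss` are hypothetical, and the two per-pixel functions are simple stand-ins for real prediction tasks.

```python
from typing import Callable, Dict, List, Optional

Grid = List[List[float]]

# Hypothetical per-pixel "tasks": toy stand-ins for real prediction tasks.
TOY_TASKS: Dict[int, Callable[[float], float]] = {
    0: lambda v: 1.0 if v > 0.5 else 0.0,  # toy thresholding task
    1: lambda v: 1.0 - v,                  # toy inversion task
}


def composite_predict(image: Grid, task_map: List[List[int]]) -> Grid:
    """Apply, at every pixel, the task requested for that pixel."""
    return [
        [TOY_TASKS[task_id](value) for value, task_id in zip(img_row, task_row)]
        for img_row, task_row in zip(image, task_map)
    ]


def sparse_l1_loss(pred: Grid, target: List[List[Optional[float]]]) -> float:
    """Mean absolute error over labelled pixels only (None = unlabelled)."""
    errors = [
        abs(p - t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
        if t is not None
    ]
    return sum(errors) / len(errors) if errors else 0.0


image = [[0.25, 0.75], [0.5, 1.0]]
task_map = [[0, 0], [1, 1]]          # top row requests task 0, bottom row task 1
pred = composite_predict(image, task_map)
target = [[0.0, None], [None, 0.5]]  # only two of the four pixels are labelled
loss = sparse_l1_loss(pred, target)
```

Changing entries of `task_map` changes which task is applied where without touching the image, which is the sense of task editing illustrated here. The loss ignores unlabelled pixels, so each task only needs labels at some locations.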