Body segmentation is an important step in many computer vision problems involving human images and a key component affecting the performance of downstream tasks. Several prior works have approached this problem with multi-task models that exploit correlations between tasks to improve segmentation performance. Building on the success of such solutions, this paper presents a novel multi-task model for human segmentation/parsing that involves three tasks: (i) keypoint-based skeleton estimation, (ii) dense pose prediction, and (iii) human-body segmentation. The main idea behind the proposed Segmentation--Pose--DensePose model (SPD for short) is to learn a better segmentation model by sharing knowledge across different, yet related, tasks. SPD is based on a shared deep neural network backbone that branches off into three task-specific model heads and is trained with a multi-task optimization objective. The performance of the model is analysed through rigorous experiments on the LIP and ATR datasets and compared against a recent state-of-the-art multi-task body-segmentation model. Comprehensive ablation studies are also presented. Our experimental results show that the proposed multi-task segmentation model is highly competitive and that the introduction of additional tasks contributes to higher overall segmentation performance.
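To make the shared-backbone design concrete, the following is a minimal illustrative sketch, not the SPD implementation. It assumes toy linear layers in place of a deep network, a squared-error loss for every task, and a weighted sum as the multi-task objective; the abstract does not specify the per-task losses or their weighting, so the names `SharedBackboneModel`, `multi_task_loss`, and `task_weights` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product, standing in for a learned layer (assumption)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


@dataclass
class SharedBackboneModel:
    """One shared backbone feeding several task-specific heads."""
    backbone: Matrix           # weights shared by all tasks
    heads: Dict[str, Matrix]   # one weight matrix per task

    def forward(self, x: Vector) -> Dict[str, Vector]:
        # Features are computed once and reused by every head.
        features = linear(self.backbone, x)
        return {task: linear(w, features) for task, w in self.heads.items()}


def squared_error(pred: Vector, target: Vector) -> float:
    """Mean squared error; a placeholder for the real per-task losses."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(outputs: Dict[str, Vector],
                    targets: Dict[str, Vector],
                    task_weights: Dict[str, float]) -> float:
    """Weighted sum of per-task losses (the weighting scheme is an assumption)."""
    return sum(task_weights[t] * squared_error(outputs[t], targets[t])
               for t in outputs)


# Toy usage: three heads named after the three tasks in the abstract.
model = SharedBackboneModel(
    backbone=[[1.0, 0.0], [0.0, 1.0]],
    heads={
        "segmentation": [[1.0, 0.0]],
        "pose": [[0.0, 1.0]],
        "densepose": [[1.0, 1.0]],
    },
)
outputs = model.forward([1.0, 2.0])
targets = {"segmentation": [0.0], "pose": [0.0], "densepose": [0.0]}
weights = {"segmentation": 1.0, "pose": 0.5, "densepose": 0.5}
total = multi_task_loss(outputs, targets, weights)
```

Because all three heads read the same features, a gradient of the combined loss would update the backbone with signal from every task, which is the mechanism by which the auxiliary tasks can inform segmentation.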