Most automatic matting methods try to separate the salient foreground from the background. However, the limited quantity and subjective bias of existing matting datasets make it difficult to fully explore the semantic associations between objects and between objects and their environment in a given image. In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates the subjective bias of matting annotations and captures sufficient situational perception information, distilled from visual-to-textual tasks, for better global saliency. SPG-IM can better associate inter-object and object-to-environment saliency, compensating for the subjective nature of image matting and its expensive annotation. We also introduce a Textual Semantic Transformation (TST) module that effectively transforms and integrates the semantic feature stream to guide the visual representations. In addition, an Adaptive Focal Transformation (AFT) Refinement Network is proposed to adaptively switch between multi-scale receptive fields and focal points to enhance both global and local details. Extensive experiments demonstrate the effectiveness of situational perception guidance from visual-to-textual tasks on image matting, and our model outperforms state-of-the-art methods. We also analyze the significance of the different components in our model. The code will be released soon.
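To make the text-guided visual transformation concrete, below is a minimal, hypothetical sketch of how textual semantic features might guide visual features via cross-attention, in the spirit of the TST module. The abstract does not specify the implementation, so all class names, dimensions, and design choices here (cross-attention with residual fusion) are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: textual semantic features (e.g., from a
# visual-to-textual branch such as image captioning) attend over
# flattened visual features to re-weight them. NOT the authors'
# implementation; names and dimensions are assumptions.
import torch
import torch.nn as nn

class TextGuidedTransform(nn.Module):
    def __init__(self, vis_dim=256, txt_dim=512, heads=8):
        super().__init__()
        # Project textual features into the visual feature space.
        self.txt_proj = nn.Linear(txt_dim, vis_dim)
        # Cross-attention: visual tokens query textual semantics.
        self.attn = nn.MultiheadAttention(vis_dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(vis_dim)

    def forward(self, vis_tokens, txt_tokens):
        # vis_tokens: (B, N_v, vis_dim), a flattened visual feature map
        # txt_tokens: (B, N_t, txt_dim), textual semantic features
        txt = self.txt_proj(txt_tokens)
        guided, _ = self.attn(query=vis_tokens, key=txt, value=txt)
        # Residual fusion keeps the original visual stream intact.
        return self.norm(vis_tokens + guided)

if __name__ == "__main__":
    vis = torch.randn(2, 64 * 64, 256)  # e.g., a 64x64 feature map
    txt = torch.randn(2, 16, 512)       # e.g., 16 caption-token embeddings
    out = TextGuidedTransform()(vis, txt)
    print(out.shape)  # torch.Size([2, 4096, 256])
```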