Alpha Matte Generation from Single Input for Portrait Matting

Portrait matting is an important research problem with a wide range of applications, such as video conferencing, image and video editing, and post-production. The goal is to predict an alpha matte that specifies how much each pixel contributes to the foreground subject. Traditional approaches and most existing works rely on an additional input, such as a trimap or a background image, to predict the alpha matte. However, providing an additional input is not always practical, and models tend to be overly sensitive to it. In this paper, we introduce an approach that performs portrait matting without any additional input, using Generative Adversarial Nets (GANs). We divide the main task into two subtasks: a segmentation network handles person segmentation, and an alpha generation network handles alpha matte prediction. The segmentation network takes an input image and produces a coarse segmentation map; the alpha generation network then takes the same input image together with that coarse segmentation map and predicts the alpha matte. In addition, we present a segmentation encoding block that downsamples the coarse segmentation map and provides a feature representation to the residual block. Furthermore, we propose a border loss that separately penalizes only the borders of the subject, which are likely to be the most challenging regions, and we adapt perceptual loss to portrait matting. To train the proposed system, we combine two popular training datasets to increase both the amount and the diversity of the data, which helps address domain shift at inference time. We tested our model on three benchmark datasets, namely the Adobe Image Matting dataset, the Portrait Matting dataset, and the Distinctions dataset. The proposed method outperformed MODNet, which also takes a single input.
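The abstract does not give the exact form of the border loss, so the following is only an illustrative sketch of the general idea: restrict an L1 penalty to a thin band around the subject boundary. The band construction (dilation minus erosion of the thresholded ground-truth alpha), the function names `border_mask` and `border_l1_loss`, and the `width` and `threshold` parameters are all assumptions made for this example, not the authors' formulation.

```python
import numpy as np


def border_mask(alpha_gt, width=1, threshold=0.5):
    """Boolean mask of pixels within `width` pixels of the subject boundary.

    Assumption: the border band is the dilated foreground minus the eroded
    foreground, using a square (2*width+1) x (2*width+1) neighbourhood.
    """
    fg = alpha_gt > threshold
    h, w = fg.shape
    padded = np.pad(fg, width, mode="edge")
    dilated = np.zeros_like(fg)  # True if any neighbour is foreground
    eroded = np.ones_like(fg)    # True only if all neighbours are foreground
    for dy in range(2 * width + 1):
        for dx in range(2 * width + 1):
            window = padded[dy:dy + h, dx:dx + w]
            dilated |= window
            eroded &= window
    return dilated & ~eroded


def border_l1_loss(alpha_pred, alpha_gt, width=1):
    """Mean absolute error computed only over the border band (hypothetical form)."""
    mask = border_mask(alpha_gt, width)
    if not mask.any():
        return 0.0
    return float(np.abs(alpha_pred - alpha_gt)[mask].mean())


if __name__ == "__main__":
    gt = np.zeros((8, 8))
    gt[2:6, 2:6] = 1.0          # a square "subject"
    pred = np.zeros((8, 8))     # a prediction that misses the subject entirely
    print(border_l1_loss(pred, gt))
```

Errors far from the boundary, whether deep inside the subject or in distant background, do not affect this term, so in practice it would be combined with a loss over the whole image.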
