Editing facial expressions by changing only the intended attributes is a long-standing research problem in image manipulation with Generative Adversarial Networks (GANs). Most existing methods rely solely on a global generator and tend to change unwanted attributes along with the target ones. Recently, hierarchical networks, which combine a global network operating on the whole image with multiple local networks focusing on local parts, have shown success. However, these methods extract local regions with bounding boxes centred on sparse facial key points, an extraction that is non-differentiable, inaccurate and unrealistic. As a result, the solution is sub-optimal and introduces unwanted artefacts that degrade the overall quality of the synthetic images. Moreover, a recent study has shown a strong correlation between facial attributes and local semantic regions. To exploit this relationship, we design a unified architecture that combines semantic segmentation with hierarchical GANs. A unique advantage of our framework is that, on the forward pass, the semantic segmentation network conditions the generative model and, on the backward pass, gradients from the hierarchical GANs are propagated to the semantic segmentation network, making the framework end-to-end differentiable. This allows both architectures to benefit from each other. To demonstrate its advantages, we evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and on a semantic segmentation benchmark, CelebAMask-HQ, across two popular architectures, BiSeNet and UNet. Our extensive quantitative and qualitative evaluations on both face semantic segmentation and facial expression manipulation validate the effectiveness of our work over existing state-of-the-art methods.
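The differentiability argument above can be illustrated with a one-dimensional toy in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: the four-pixel "image", the squared-error loss standing in for a local network, and every function name (`soft_region_loss`, `hard_box_loss`, and so on) are hypothetical. It contrasts a soft mask, such as a segmentation network might output, with a hard crop around a key-point coordinate.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_region_loss(logits, image, target):
    """Loss of a stand-in local network that sees the image weighted by a soft mask."""
    return sum((sigmoid(l) * p - t) ** 2 for l, p, t in zip(logits, image, target))

def soft_region_grad(logits, image, target):
    """Analytic gradient of soft_region_loss with respect to each mask logit."""
    grads = []
    for l, p, t in zip(logits, image, target):
        m = sigmoid(l)
        # Chain rule: d/dl (m*p - t)^2 = 2*(m*p - t) * p * m*(1 - m)
        grads.append(2.0 * (m * p - t) * p * m * (1.0 - m))
    return grads

def hard_box_loss(center, half_width, image, target):
    """Same loss, but the region is a hard crop around a key-point coordinate."""
    total = 0.0
    for i, (p, t) in enumerate(zip(image, target)):
        inside = 1.0 if abs(i - center) <= half_width else 0.0
        total += (inside * p - t) ** 2
    return total

def finite_diff(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# Hypothetical toy data: a 4-pixel image whose middle two pixels form the region.
image = [0.2, 0.9, 0.8, 0.1]
target = [0.0, 0.9, 0.8, 0.0]
logits = [0.5, 0.5, 0.5, 0.5]

# Soft mask: every logit receives a non-zero gradient from the loss.
soft_grads = soft_region_grad(logits, image, target)

# Hard box: a small shift of the box centre leaves the crop unchanged,
# so the estimated gradient with respect to the centre is zero.
box_grad = finite_diff(lambda c: hard_box_loss(c, 0.6, image, target), 1.5)
```

In this toy, each mask logit receives a usable gradient from the downstream loss, whereas the box centre receives none. This is the property that lets gradients from the generative side reach the segmentation side in an end-to-end architecture.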