Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis, within a unified framework: once trained, a single model handles all of these tasks. Existing task-specific methods mainly use 2D keypoints to estimate the human body structure. However, keypoints express only position information; they can neither characterize a person's individual body shape nor model limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle pose and shape. It models joint locations and rotations and also characterizes the personalized body shape. To preserve source information such as texture, style, color, and face identity, we propose an Attentional Liquid Warping GAN with an Attentional Liquid Warping Block (AttLWB) that propagates the source information in both image and feature spaces to the synthesized reference. Specifically, the source features are extracted by a denoising convolutional auto-encoder so that they characterize the source identity well. Furthermore, our method supports more flexible warping from multiple sources. To further improve generalization to unseen source images, we apply one/few-shot adversarial learning: the model is first trained on an extensive training set and then fine-tuned on one or a few unseen images in a self-supervised way to generate high-resolution (512 x 512 and 1024 x 1024) results. We also build a new dataset, iPER, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis. Extensive experiments demonstrate the effectiveness of our method in preserving face identity, shape consistency, and clothing details. All code and the dataset are available at https://impersonator.org/work/impersonator-plus-plus.html.
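
The abstract states that AttLWB propagates features from multiple sources to the synthesized image but does not give the fusion rule. The following is only a minimal sketch of one plausible reading, assuming the source features have already been warped into the target's layout and are combined with scaled dot-product attention plus a residual sum. The function names `softmax` and `attention_fuse`, the list-of-vectors feature layout, and the attention formula itself are illustrative assumptions, not the paper's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(target, warped_sources):
    """Fuse already-warped source features into target features.

    ASSUMPTION: this is a hypothetical stand-in for an attentional
    warping block, not the authors' AttLWB.

    target:          list of per-location feature vectors (lists of floats)
    warped_sources:  list with one entry per source image; each entry has
                     the same shape as `target`
    """
    fused = []
    for loc, query in enumerate(target):
        # Feature vector of every source at this spatial location.
        keys = [src[loc] for src in warped_sources]
        # Scaled dot-product similarity between target and each source.
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        # Attention-weighted average of the source features.
        aggregated = [
            sum(w * key[c] for w, key in zip(weights, keys))
            for c in range(len(query))
        ]
        # Residual sum: keep the target feature and add source information.
        fused.append([q + a for q, a in zip(query, aggregated)])
    return fused


if __name__ == "__main__":
    target = [[1.0, 0.0], [0.0, 1.0]]
    source_a = [[10.0, 0.0], [0.0, 0.0]]
    source_b = [[0.0, 10.0], [0.0, 0.0]]
    print(attention_fuse(target, [source_a, source_b]))
```

With a single source the attention weight is 1 and the block reduces to adding the warped source features to the target features; with several sources, the source most similar to the target at a given location contributes most there.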
