Deep generative models have made great progress in synthesizing images with arbitrary human poses and transferring the pose of one person to another. Although many methods have been proposed to generate images with high visual fidelity, two fundamental issues remain challenging: pose ambiguity and appearance inconsistency. To alleviate these limitations and improve the quality of the synthesized images, we propose a pose transfer network with augmented Disentangled Feature Consistency (DFC-Net) to facilitate human pose transfer. Given a pair of images containing a source and a target person, DFC-Net extracts pose information from the source and static (appearance) information from the target, then synthesizes an image of the target person in the pose of the source. Moreover, DFC-Net leverages disentangled feature consistency losses during adversarial training to strengthen transfer coherence, and integrates a keypoint amplifier to enhance pose feature extraction. Building on the disentangled feature consistency losses, we further propose a novel data augmentation scheme that introduces unpaired support data with augmented consistency constraints to improve the generality and robustness of DFC-Net. Extensive experimental results on Mixamo-Pose and EDN-10k demonstrate that DFC-Net achieves state-of-the-art performance on pose transfer.
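To make the disentangled feature consistency idea concrete, the sketch below shows one plausible form of such a loss in PyTorch: the synthesized image should agree with the source in pose feature space and with the target in static (appearance) feature space. This is a minimal illustration, not the paper's implementation; the encoder arguments `pose_enc` and `static_enc` and the choice of an L1 penalty are assumptions.

```python
import torch
import torch.nn.functional as F

def disentangled_feature_consistency_loss(pose_enc, static_enc,
                                          source, target, synthesized):
    """Hypothetical sketch of a disentangled feature consistency loss.

    pose_enc, static_enc: assumed feature extractors (nn.Module) for pose
    and static (appearance) features, respectively.
    source, target, synthesized: image batches of shape (N, C, H, W).
    """
    # The synthesized image should preserve the source's pose features...
    pose_loss = F.l1_loss(pose_enc(synthesized), pose_enc(source))
    # ...and the target's static (appearance) features.
    static_loss = F.l1_loss(static_enc(synthesized), static_enc(target))
    return pose_loss + static_loss
```

In this reading, the same loss can also be applied to unpaired support images, since it never requires a ground-truth synthesized result, which is what makes the augmented consistency constraints on unpaired data possible.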