This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors. Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure, jointly with differentiable rendering, to synthesize new videos of people in contexts different from those where they were originally recorded. Unlike recent appearance transfer methods, our approach can reconstruct a fully controllable 3D texture-mapped model of a person, while taking into account the manifold structure of both body shape and texture appearance in the view synthesis. Specifically, our approach models mesh deformations with a three-stage GCN trained in a self-supervised manner on rendered silhouettes of the human body. It also infers texture appearance with a convolutional network in the texture domain, which is trained in an adversarial regime to reconstruct human texture from rendered images of actors in different poses. Experiments on different videos show that our method successfully infers specific body deformations and avoids creating texture artifacts, while achieving the best appearance scores in terms of Structural Similarity (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), Mean Squared Error (MSE), and Fr\'echet Video Distance (FVD). By taking advantage of both differentiable rendering and the 3D parametric model, our method is fully controllable: the synthesis can be driven by both pose and rendering parameters. The source code is available at https://www.verlab.dcc.ufmg.br/retargeting-motion/wacv2022.
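To make the mesh-deformation component concrete, the sketch below shows a minimal graph-convolution layer of the kind the abstract refers to: per-vertex features are propagated through the mesh adjacency before a learned linear map predicts per-vertex offsets. The symmetric-normalization form, the layer dimensions, and the toy adjacency are illustrative assumptions, not the paper's exact three-stage architecture.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """Single graph-convolution layer over mesh vertices.

    Propagates per-vertex features through the mesh connectivity with a
    symmetrically normalized adjacency (Kipf & Welling style propagation);
    this generic layer form is an assumption, not the paper's exact design.
    """

    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        adj = adj + torch.eye(adj.size(0))        # add self-loops
        d_inv_sqrt = adj.sum(dim=1).pow(-0.5)     # D^{-1/2}
        self.register_buffer(
            "norm_adj", d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
        )
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: (batch, V, in_dim) per-vertex features, e.g. 3D positions;
        # output: (batch, V, out_dim), e.g. predicted deformation offsets.
        return torch.relu(self.linear(self.norm_adj @ x))

# Hypothetical usage on a toy 4-vertex tetrahedron "mesh"; a real model
# would use the template-mesh adjacency (e.g. SMPL's 6890 vertices).
adj = torch.ones(4, 4) - torch.eye(4)  # each vertex connected to the others
layer = GraphConv(in_dim=3, out_dim=3, adj=adj)
offsets = layer(torch.rand(1, 4, 3))   # (1, 4, 3) deformation offsets
```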