Previous virtual try-on methods usually focus on aligning a clothing item with a person, which limits their ability to exploit the person's complex pose, shape, and skin color, as well as the overall structure of the clothing, all of which are vital to photo-realistic virtual try-on. To address this weakness, we propose the Fill in Fabrics (FIFA) model, a self-supervised framework based on conditional generative adversarial networks that comprises a Fabricator and a unified virtual try-on pipeline with a Segmenter, Warper, and Fuser. The Fabricator reconstructs a clothing image from a masked version of it, and learns the overall structure of the clothing by filling in the missing fabric. The virtual try-on pipeline is then trained by transferring the representations learned by the Fabricator to the Warper, which warps and refines the target clothing. We also propose a multi-scale structural constraint that enforces global context at multiple scales while warping the target clothing, so that it better fits the pose and shape of the person. Extensive experiments demonstrate that FIFA achieves state-of-the-art results on the standard VITON dataset for virtual try-on of clothing items, and that it handles complex poses effectively while retaining the texture and embroidery of the clothing.
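The two ideas the abstract leans on, reconstructing a clothing image from a masked copy and comparing images at several scales, can be illustrated with a small dependency-free sketch. This is not the FIFA implementation. The function names (`mask_region`, `downsample2x`, `multi_scale_l1`), the rectangular mask, and the 2x2 average pooling are all assumptions made for illustration, and a plain multi-scale L1 error stands in for the multi-scale structural constraint, whose exact form the abstract does not give.

```python
def mask_region(image, top, left, height, width, fill=0.0):
    """Return a copy of `image` (a list of rows) with a rectangle set to `fill`.

    Hypothetical stand-in for the masking applied to the Fabricator's input.
    """
    masked = [row[:] for row in image]
    for r in range(top, min(top + height, len(image))):
        for c in range(left, min(left + width, len(image[0]))):
            masked[r][c] = fill
    return masked


def downsample2x(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def l1(a, b):
    """Mean absolute difference between two equally sized images."""
    n = len(a) * len(a[0])
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def multi_scale_l1(pred, target, scales=3):
    """Sum the L1 error at `scales` resolutions, halving the size each time.

    Assumption: a simple proxy for a multi-scale constraint, not the paper's loss.
    The images must be at least 2**(scales - 1) pixels on each side.
    """
    total = 0.0
    for level in range(scales):
        total += l1(pred, target)
        if level < scales - 1:
            pred, target = downsample2x(pred), downsample2x(target)
    return total


# Toy "clothing image": an 8x8 grid of ones with a 4x4 hole masked out.
image = [[1.0] * 8 for _ in range(8)]
masked = mask_region(image, top=2, left=2, height=4, width=4)
error = multi_scale_l1(masked, image)
```

In a real training loop the masked image would be fed to a generator and the error would be computed between the generator's output and the original image. Here the masked image is compared with the original directly, only to show that the error is zero for a perfect reconstruction and positive otherwise.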