Virtual try-on (VTON) aims at fitting target clothes to reference person images and is widely adopted in e-commerce. Existing VTON approaches can be narrowly categorized into Parser-Based (PB) and Parser-Free (PF) methods, depending on whether they rely on parser information to mask the person's clothes and synthesize try-on images. Although abandoning parser information has improved the applicability of PF methods, the ability to synthesize details has been sacrificed. As a result, distraction from the original clothes may persist in the synthesized images, especially for complicated postures and high-resolution applications. To address this issue, we propose a novel PF method named Regional Mask Guided Network (RMGN). More specifically, a regional mask is proposed to explicitly fuse the features of target clothes and reference persons so that the persisted distraction can be eliminated. A posture awareness loss and a multi-level feature extractor are further proposed to handle complicated postures and synthesize high-resolution images. Extensive experiments demonstrate that our proposed RMGN outperforms both state-of-the-art PB and PF methods. Ablation studies further verify the effectiveness of the modules in RMGN.
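To make the regional-mask fusion idea concrete, the following is a minimal PyTorch sketch of how a predicted soft mask could explicitly blend target-cloth features with reference-person features at each spatial location; the module name, the mask-prediction head, and all hyperparameters are hypothetical assumptions, since the abstract does not specify the actual architecture.

```python
import torch
import torch.nn as nn


class RegionalMaskFusion(nn.Module):
    """Hypothetical sketch (not the paper's exact module): predict a soft
    regional mask and use it to fuse cloth and person feature maps."""

    def __init__(self, channels: int):
        super().__init__()
        # Assumed mask head: infer a single-channel soft mask from the
        # concatenated cloth/person features.
        self.mask_head = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, cloth_feat: torch.Tensor, person_feat: torch.Tensor) -> torch.Tensor:
        mask = self.mask_head(torch.cat([cloth_feat, person_feat], dim=1))
        # Explicit fusion: cloth features dominate where the mask is high,
        # person features elsewhere, suppressing leftover original-cloth cues.
        return mask * cloth_feat + (1.0 - mask) * person_feat


if __name__ == "__main__":
    fusion = RegionalMaskFusion(channels=64)
    cloth = torch.randn(1, 64, 32, 32)
    person = torch.randn(1, 64, 32, 32)
    print(fusion(cloth, person).shape)  # torch.Size([1, 64, 32, 32])
```

In this sketch the fusion is a per-pixel convex combination, which captures the abstract's notion of explicitly deciding, region by region, which source the synthesized features come from.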