Image-based virtual try-on strives to transfer the appearance of a clothing item onto the image of a target person. Prior work focuses mainly on upper-body clothes (e.g., t-shirts, shirts, and tops) and neglects full-body or lower-body items. This shortcoming arises from one main factor: current publicly available datasets for image-based virtual try-on do not account for this variety, thus limiting progress in the field. To address this deficiency, we introduce Dress Code, which contains images of multi-category clothes. Dress Code is more than 3x larger than publicly available datasets for image-based virtual try-on and features high-resolution paired images (1024x768) with front-view, full-body reference models. To generate HD try-on images with high visual quality and rich detail, we propose to learn fine-grained discriminating features. Specifically, we leverage a semantic-aware discriminator that makes predictions at the pixel level instead of the image or patch level. Extensive experimental evaluation demonstrates that the proposed approach surpasses the baselines and state-of-the-art competitors in terms of visual quality and quantitative results. The Dress Code dataset is publicly available at https://github.com/aimagelab/dress-code.
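To make the pixel-level idea concrete, the sketch below illustrates one possible form of a semantic-aware discriminator in PyTorch: an encoder-decoder that outputs, for every pixel, logits over N semantic classes plus one extra "fake" class, so real pixels are supervised with their human-parsing label and generated pixels with the fake class. This is a minimal illustration under assumed design choices, not the authors' released implementation; the class name, layer sizes, and the choice of 18 semantic classes are placeholders.

```python
# Minimal sketch (assumptions, not the paper's released code) of a
# discriminator that classifies every pixel instead of scoring whole
# images or patches.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PixelLevelSemanticDiscriminator(nn.Module):
    """Encoder-decoder predicting, per pixel, one of `num_classes`
    semantic labels or an extra 'fake' class (num_classes + 1 channels)."""

    def __init__(self, in_channels: int = 3, num_classes: int = 18, base: int = 64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_channels, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1),
                                  nn.InstanceNorm2d(base * 2), nn.LeakyReLU(0.2))
        self.enc3 = nn.Sequential(nn.Conv2d(base * 2, base * 4, 4, 2, 1),
                                  nn.InstanceNorm2d(base * 4), nn.LeakyReLU(0.2))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1),
                                  nn.InstanceNorm2d(base * 2), nn.LeakyReLU(0.2))
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(base * 4, base, 4, 2, 1),
                                  nn.InstanceNorm2d(base), nn.LeakyReLU(0.2))
        self.dec3 = nn.ConvTranspose2d(base * 2, num_classes + 1, 4, 2, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                           # (B, base,   H/2, W/2)
        e2 = self.enc2(e1)                          # (B, 2*base, H/4, W/4)
        e3 = self.enc3(e2)                          # (B, 4*base, H/8, W/8)
        d1 = self.dec1(e3)                          # (B, 2*base, H/4, W/4)
        d2 = self.dec2(torch.cat([d1, e2], dim=1))  # (B, base,   H/2, W/2)
        return self.dec3(torch.cat([d2, e1], dim=1))  # (B, num_classes+1, H, W)


def discriminator_loss(logits_real, logits_fake, parse_map):
    """Per-pixel (N+1)-way cross-entropy: real pixels target their semantic
    label from the parsing map, generated pixels target the 'fake' class."""
    b, c, h, w = logits_fake.shape
    fake_target = torch.full((b, h, w), c - 1, dtype=torch.long,
                             device=logits_fake.device)
    return F.cross_entropy(logits_real, parse_map) + F.cross_entropy(logits_fake, fake_target)


if __name__ == "__main__":
    # Dummy shapes only: 256x192 crops and an 18-class parsing map.
    disc = PixelLevelSemanticDiscriminator(num_classes=18)
    real, fake = torch.randn(2, 3, 256, 192), torch.randn(2, 3, 256, 192)
    parse_map = torch.randint(0, 18, (2, 256, 192))
    print(discriminator_loss(disc(real), disc(fake), parse_map).item())
```

In this setup the per-pixel supervision forces the discriminator to localize which body and garment regions look unrealistic, which is the intuition behind preferring pixel-level over image- or patch-level predictions.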