In computer vision, human pose synthesis and transfer deal with the probabilistic generation of images of a person in a previously unseen pose from an already available observation of that person. Although researchers have recently proposed several methods to achieve this task, most of these techniques derive the target pose directly from the desired target image on a specific dataset, making the underlying process challenging to apply in real-world scenarios, where generating the target image is the actual aim. In this paper, we first present the shortcomings of current pose transfer algorithms and then propose a novel text-based pose transfer technique to address those issues. We divide the problem into three independent stages: (a) text to pose representation, (b) pose refinement, and (c) pose rendering. To the best of our knowledge, this is one of the first attempts to develop a text-based pose transfer framework, for which we also introduce a new dataset, DF-PASS, by adding descriptive pose annotations for the images of the DeepFashion dataset. The proposed method achieves promising qualitative and quantitative results in our experiments.
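The three-stage decomposition described above can be sketched as a simple pipeline. This is a minimal illustrative sketch only: the function names, the keypoint representation, and all placeholder logic are assumptions for exposition, not the authors' actual models.

```python
# Hypothetical sketch of the three-stage text-based pose transfer pipeline:
# (a) text to pose representation, (b) pose refinement, (c) pose rendering.
# All names, data shapes, and placeholder logic are illustrative assumptions.

from typing import List, Tuple

Keypoint = Tuple[float, float]  # (x, y) in normalized image coordinates


def text_to_pose(description: str) -> List[Keypoint]:
    """Stage (a): map a textual pose description to a coarse keypoint set.

    A real system would use a learned text encoder; here we return a fixed
    18-keypoint skeleton as a stand-in.
    """
    return [(0.5, 0.1 + 0.05 * i) for i in range(18)]


def refine_pose(coarse: List[Keypoint]) -> List[Keypoint]:
    """Stage (b): refine coarse keypoints into a plausible pose.

    Placeholder: clamp coordinates into the valid [0, 1] range.
    """
    return [(min(max(x, 0.0), 1.0), min(max(y, 0.0), 1.0)) for x, y in coarse]


def render_pose(source_image, target_pose: List[Keypoint]):
    """Stage (c): render the person from the source image in the target pose.

    Placeholder: return a record pairing the source with the refined pose.
    """
    return {"source": source_image, "pose": target_pose}


def pose_transfer(source_image, description: str):
    """Chain the three independent stages end to end."""
    coarse = text_to_pose(description)
    refined = refine_pose(coarse)
    return render_pose(source_image, refined)
```

Because the stages are independent, each could be trained or replaced separately, which is the practical advantage the abstract claims over methods that extract the target pose from the (unavailable) target image.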