Video creation has long been an attractive yet challenging task for artists. With the advancement of deep learning, recent works use deep convolutional neural networks to synthesize a video with the aid of a guiding video, and have achieved promising results. However, acquiring guiding videos, or other forms of guiding temporal information, is costly and difficult in practice. Therefore, in this work we introduce a new video synthesis task that takes only two rough, badly drawn sketches as input and creates a realistic portrait video. We propose a two-stage Sketch-to-Video model with two key novelties: 1) A feature retrieve and projection (FRP) module, which partitions the input sketch into different parts and uses these parts to synthesize a realistic start or end frame while generating rich semantic features; it is designed to alleviate the sketch out-of-domain problem caused by the arbitrary free-form sketch styles of different users. 2) A motion projection followed by feature blending module, which projects a video (used only in the training phase) into a motion space modeled by a normal distribution and blends the motion variables with the semantic features extracted above; it is proposed to alleviate the problem of missing guiding temporal information in the test phase. Experiments conducted on a combination of the CelebAMask-HQ and VoxCeleb2 datasets validate that our method achieves good quantitative and qualitative results in synthesizing high-quality videos from two rough, badly drawn sketches.
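As a rough illustration of the second module, the sketch below shows one way a motion variable drawn from a normal distribution could be blended with start-frame and end-frame features. It is not the paper's implementation: the function names (`sample_motion`, `sample_prior`, `blend_features`), the quadratic blending weight, and the `strength` parameter are all assumptions made for this example, and plain Python lists stand in for learned feature tensors.

```python
import math
import random


def sample_motion(mu, logvar, rng):
    """Training-time sample: z = mu + sigma * eps with eps ~ N(0, 1).

    mu and logvar are assumed to come from a (hypothetical) video encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def sample_prior(dim, rng):
    """Test-time sample: no guiding video exists, so draw z from N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def blend_features(f_start, f_end, motion, num_frames, strength=0.1):
    """Blend start/end features with a motion variable for each frame.

    Assumed blending rule (illustrative only): linear interpolation between
    the two frame features plus a motion offset whose weight is zero at the
    first and last frame and largest in the middle.
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)
        w = strength * 4.0 * t * (1.0 - t)  # vanishes at t = 0 and t = 1
        frames.append([(1.0 - t) * a + t * b + w * z
                       for a, b, z in zip(f_start, f_end, motion)])
    return frames
```

With this rule the first and last blended features equal the start and end features exactly, so the sampled motion variable only shapes the in-between frames. This is a design choice of the sketch, not a claim about the proposed model.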