This paper presents a deep-learning-based framework for matching a given face sketch image against a face photo database. Photo-sketch matching is challenging because 1) there is a large modality gap between photos and sketches, and 2) the number of paired training samples is insufficient to train deep-learning-based networks. To circumvent the large modality gap, our approach uses an intermediate latent space between the two modalities. We align the distributions of the two modalities in this latent space by employing a bidirectional (photo -> sketch and sketch -> photo) collaborative synthesis network. A StyleGAN-like architecture is used to equip the intermediate latent space with rich representation power. To resolve the problem of insufficient training samples, we introduce a three-step training scheme. Extensive evaluation on a public composite face sketch database confirms the superior performance of our method compared to existing state-of-the-art methods. The proposed methodology can be employed to match other modality pairs.
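As an illustration of how matching in a shared latent space might proceed once both modalities are encoded, the following is a minimal sketch and not the method's actual implementation. It assumes each photo and sketch has already been mapped to a fixed-length latent code, and it uses cosine similarity as the ranking score; the function names, the similarity measure, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length latent codes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def rank_gallery(sketch_code, photo_codes):
    """Rank gallery photos by similarity to the probe sketch code.

    sketch_code: latent code of the probe sketch (assumed precomputed).
    photo_codes: dict mapping photo id -> latent code (assumed precomputed).
    Returns a list of (photo_id, score) pairs, best match first.
    """
    scores = [
        (photo_id, cosine_similarity(sketch_code, code))
        for photo_id, code in photo_codes.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy latent codes, purely illustrative (real codes would come from the encoders).
gallery = {
    "photo_a": [1.0, 0.0, 0.0],
    "photo_b": [0.0, 1.0, 0.0],
    "photo_c": [0.7, 0.7, 0.0],
}
probe = [0.9, 0.1, 0.0]

ranking = rank_gallery(probe, gallery)
```

In this toy setup the first entry of `ranking` is the gallery photo whose code points in the direction closest to the probe sketch's code. Any other similarity or distance measure could be substituted without changing the overall retrieval structure.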