Advances in Generative Adversarial Networks (GANs) have led to architectures capable of producing remarkably realistic images, such as StyleGAN2, which, when trained on the FFHQ dataset, generates images of human faces from random vectors in a lower-dimensional latent space. Unfortunately, this space is entangled: translating a latent vector along its axes does not correspond to a meaningful transformation in the output space (e.g., a smiling mouth or squinting eyes). The model behaves as a black box, providing neither control over its output nor insight into the structures it has learned from the data. We present a method to explore the manifolds of changes of spatially localized regions of the face. Our method discovers smoothly varying sequences of latent vectors along these manifolds that are suitable for creating animations. Unlike existing disentanglement methods, which either require labelled data or explicitly alter internal model parameters, ours is an optimization-based approach guided by a custom loss function and a manually defined region of change. Our code is open source and can be found, along with supplementary results, on our project page: https://github.com/bmolab/masked-gan-manifold
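The abstract describes the approach only at a high level: an optimization over latent vectors, guided by a custom loss function and a manually defined region of change. As a rough illustration only, and not the paper's actual objective, the sketch below shows one way such masked latent optimization could look in PyTorch. The generator G, the loss terms, and all names here are assumptions for the sake of the example.

```python
# Minimal sketch of masked latent-space optimization (hypothetical; names and
# loss terms are assumptions, not the paper's actual method). `G` is assumed
# to be a pretrained StyleGAN2 generator mapping W-space latents to images.
import torch

def masked_edit_loss(img, base_img, mask, alpha=1.0):
    """Encourage change inside `mask` while preserving pixels outside it.

    img, base_img: (1, 3, H, W) generated and reference images
    mask:          (1, 1, H, W) binary mask marking the region allowed to change
    """
    change = ((img - base_img) ** 2 * mask).mean()          # change inside the mask
    preserve = ((img - base_img) ** 2 * (1 - mask)).mean()  # drift outside the mask
    # Reward in-mask change, penalize out-of-mask drift.
    return -alpha * change + preserve

def explore_manifold(G, w_base, mask, steps=200, lr=0.01, reg=0.1):
    """Optimize an offset to `w_base` so that the edit stays localized to `mask`."""
    base_img = G(w_base).detach()
    delta = torch.zeros_like(w_base, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        img = G(w_base + delta)
        # The L2 penalty on `delta` keeps the edit bounded, since the
        # in-mask change term alone would grow without limit.
        loss = masked_edit_loss(img, base_img, mask) + reg * (delta ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w_base + delta
```

Intermediate values of `w_base + delta` recorded during the optimization would trace a smoothly varying sequence of latents of the kind the abstract describes for animation, though the actual loss and sequencing scheme are those in the open-sourced code, not this sketch.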