We present a method for finding paths in a deep generative model's latent space that can maximally vary one set of image features while holding others constant. Crucially, unlike past traversal approaches, ours can manipulate multidimensional features of an image such as facial identity and pixels within a specified region. Our method is principled and conceptually simple: optimal traversal directions are chosen by maximizing differential changes to one feature set such that changes to another set are negligible. We show that this problem is nearly equivalent to one of Rayleigh quotient maximization, and provide a closed-form solution to it based on solving a generalized eigenvalue equation. We use repeated computations of the corresponding optimal directions, which we call Rayleigh EigenDirections (REDs), to generate appropriately curved paths in latent space. We empirically evaluate our method using StyleGAN2 on two image domains: faces and living rooms. We show that our method is capable of controlling various multidimensional features out of the scope of previous latent space traversal methods: face identity, spatial frequency bands, pixels within a region, and the appearance and position of an object. Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
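The core computation described above can be illustrated with a minimal sketch. The abstract does not give the exact objective, so the following are assumptions made for illustration only: the features are differentiable functions of the latent code, their local behaviour is captured by Jacobians, and the Rayleigh quotient is regularized with a small `eps` so that the denominator matrix stays positive definite. The toy feature functions `f_change` and `f_fixed`, the finite-difference helper `jacobian_fd`, and the function names `top_direction` and `traverse` are all hypothetical; a real setup would differentiate through a generator such as StyleGAN2 rather than through hand-written functions.

```python
import numpy as np
from scipy.linalg import eigh


def jacobian_fd(func, z, h=1e-5):
    """Central finite-difference Jacobian of func at z (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = h
        cols.append((func(z + dz) - func(z - dz)) / (2.0 * h))
    return np.stack(cols, axis=1)  # shape: (feature_dim, latent_dim)


def top_direction(J_change, J_fixed, eps=1e-4):
    """Unit direction v maximizing (v^T A v) / (v^T B v).

    A = J_change^T J_change measures the differential change in the features
    we want to vary; B = J_fixed^T J_fixed + eps * I measures the change in
    the features we want to hold constant. The eps term is an assumed
    regularizer, not something stated in the abstract.
    """
    A = J_change.T @ J_change
    B = J_fixed.T @ J_fixed + eps * np.eye(J_fixed.shape[1])
    # Generalized symmetric-definite eigenproblem A v = lambda B v;
    # eigenvalues are returned in ascending order.
    eigvals, eigvecs = eigh(A, B)
    v = eigvecs[:, -1]
    return v / np.linalg.norm(v), eigvals[-1]


def traverse(f_change, f_fixed, z0, n_steps=10, step=0.05):
    """Take small steps, recomputing the optimal direction at every point."""
    z = np.asarray(z0, dtype=float)
    path = [z.copy()]
    prev = None
    for _ in range(n_steps):
        v, _ = top_direction(jacobian_fd(f_change, z), jacobian_fd(f_fixed, z))
        # Eigenvectors have arbitrary sign; keep the walk from reversing.
        if prev is not None and v @ prev < 0:
            v = -v
        z = z + step * v
        path.append(z.copy())
        prev = v
    return np.stack(path)


# Toy stand-ins for image features (assumptions, not the paper's features).
def f_change(z):
    return np.array([2.0 * z[1], z[2]])


def f_fixed(z):
    return np.array([z[0] ** 2 + z[1] ** 2])


path = traverse(f_change, f_fixed, z0=[1.0, 0.0, 0.0])
print("change in varied features:", f_change(path[-1]) - f_change(path[0]))
print("change in fixed feature:  ", f_fixed(path[-1]) - f_fixed(path[0]))
```

In this toy setting the fixed feature is a squared radius, so the recomputed directions follow the tangent of a circle and the resulting path is curved rather than straight. Recomputing the direction at each step is what lets the path bend with the local geometry; a single global direction would drift off the constraint much faster.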