Flow-based generative models have become an important class of unsupervised learning approaches. In this work, we incorporate the key ideas of the renormalization group (RG) and sparse prior distributions to design a hierarchical flow-based generative model, RG-Flow, which can separate image information at different scales and extract disentangled representations at each scale. We demonstrate our method on synthetic multi-scale image datasets and the CelebA dataset, showing that the disentangled representations enable semantic manipulation and style mixing of the images at different scales. To visualize the latent representations, we introduce receptive fields for flow-based models and show that the receptive fields of RG-Flow are similar to those of convolutional neural networks. In addition, we replace the widely adopted isotropic Gaussian prior with a sparse Laplacian prior to further enhance the disentanglement of representations. From a theoretical perspective, our proposed method has $O(\log L)$ complexity for inpainting an image of edge length $L$, compared to the $O(L^2)$ complexity of previous generative models.
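The intuition behind the sparse prior can be seen by comparing log-densities: a Laplacian prior penalizes latent activations by $|z|$ (an L1-type penalty), whereas a Gaussian penalizes by $z^2/2$, so the Laplacian favors representations in which most latent components sit near zero while a few take large values. A minimal sketch of this comparison, with both priors normalized to unit variance (the function names and the specific scale choice are illustrative, not taken from the paper):

```python
import math

def gaussian_logpdf(z, sigma=1.0):
    """Log-density of an isotropic Gaussian component: -z^2/(2 sigma^2) + const."""
    return -0.5 * (z / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def laplace_logpdf(z, b=1.0 / math.sqrt(2.0)):
    """Log-density of a Laplacian component with scale b: -|z|/b + const.
    b = 1/sqrt(2) gives unit variance, matching the Gaussian above."""
    return -abs(z) / b - math.log(2.0 * b)

# The Laplacian places more probability mass both near zero and in the
# tails, at the expense of mid-sized activations -- the L1-style penalty
# that encourages sparse, disentangled latent codes.
print(laplace_logpdf(0.0) > gaussian_logpdf(0.0))   # more mass at zero
print(laplace_logpdf(5.0) > gaussian_logpdf(5.0))   # heavier tails
print(gaussian_logpdf(1.5) > laplace_logpdf(1.5))   # mid-range is penalized
```

All three comparisons print `True`, showing the characteristic "peaked at zero, heavy-tailed" shape that underlies sparsity-promoting priors.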
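The $O(\log L)$ claim follows from a counting argument: in a hierarchy that coarse-grains the image by a constant factor per level, there are $O(\log L)$ levels, and inpainting a local patch only requires revisiting the latent blocks on the causal cone above that patch, roughly one per level. A toy counting sketch under a binary (factor-of-2) hierarchy assumption; the function names are illustrative and this is not the paper's algorithm:

```python
import math

def rg_inpaint_cost(L):
    """Latent blocks touched when inpainting one local patch under a binary
    RG hierarchy: about one block per level, with log2(L) levels in total."""
    return int(math.log2(L))

def flat_inpaint_cost(L):
    """A model with a globally coupled latent ties every pixel to all L*L
    latent dimensions, so inpainting scales as O(L^2)."""
    return L * L

for L in (64, 256, 1024):
    print(L, rg_inpaint_cost(L), flat_inpaint_cost(L))
```

Doubling the edge length adds one level to the hierarchical cost but quadruples the flat cost, which is the gap between $O(\log L)$ and $O(L^2)$.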