We introduce a new architecture called a conditional invertible neural network (cINN), and use it to address the task of diverse image-to-image translation for natural images. This task is not easily solved with existing INN models due to some fundamental limitations. The cINN combines the purely generative INN model with an unconstrained feed-forward network, which efficiently preprocesses the conditioning image into maximally informative features. All parameters of a cINN are jointly optimized with a stable, maximum likelihood-based training procedure. Even though INN-based models have received far less attention in the literature than GANs, they have been shown to have some remarkable properties absent in GANs, e.g. apparent immunity to mode collapse. We find that our cINNs leverage these properties for image-to-image translation, demonstrated on day-to-night translation and image colorization. Furthermore, we take advantage of our bidirectional cINN architecture to explore and manipulate emergent properties of the latent space, such as changing the image style in an intuitive way.
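To illustrate the core mechanism behind a conditional INN, the sketch below shows a toy conditional affine coupling block in NumPy. This is a minimal sketch under several assumptions, not the authors' actual architecture: the "subnetwork" producing scale and shift is a fixed random affine map (in practice it is a trained neural network), the conditioning features `cond` stand in for the output of the feed-forward conditioning network, and dimensions are tiny for clarity. The key properties shown are exact invertibility given the condition, and a tractable log-determinant term as required for maximum-likelihood training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "subnetwork": a fixed random affine map that produces a
# log-scale s and shift t from the passive half x1 plus the conditioning
# features. In a real cINN this would be a trained (unconstrained) network.
W = rng.normal(size=(4, 4)) * 0.1
b = rng.normal(size=4) * 0.1

def subnet(x1, cond):
    h = np.concatenate([x1, cond]) @ W + b  # x1: 2 dims, cond: 2 dims
    s, t = h[:2], h[2:]
    return np.tanh(s), t  # bounded log-scale for numerical stability

def coupling_forward(x, cond):
    """One conditional affine coupling step: x1 passes through unchanged,
    x2 is scaled and shifted by functions of (x1, cond)."""
    x1, x2 = x[:2], x[2:]
    s, t = subnet(x1, cond)
    y2 = x2 * np.exp(s) + t
    log_det = s.sum()  # log |det J|, the Jacobian term in the ML loss
    return np.concatenate([x1, y2]), log_det

def coupling_inverse(y, cond):
    """Exact inverse: since y1 == x1, the same subnet outputs are
    recomputed and the affine transform is undone analytically."""
    y1, y2 = y[:2], y[2:]
    s, t = subnet(y1, cond)
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2])

x = rng.normal(size=4)
cond = rng.normal(size=2)
y, log_det = coupling_forward(x, cond)
x_rec = coupling_inverse(y, cond)
assert np.allclose(x, x_rec)  # invertibility holds exactly, given cond
```

Because the condition enters only through the subnetwork inputs, invertibility between `x` and `y` is preserved for any conditioning features, which is what lets the conditioning network remain unconstrained.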