Generative adversarial networks (GANs) are popular for generative tasks; however, they often require careful architecture selection, extensive empirical tuning, and are prone to mode collapse. To overcome these challenges, we propose a novel model that identifies the low-dimensional structure of the underlying data distribution, maps it into a low-dimensional latent space while preserving the underlying geometry, and then optimally transports a reference measure to the embedded distribution. We prove three key properties of our method: 1) The encoder preserves the geometry of the underlying data; 2) The generator is $c$-cyclically monotone, where $c$ is an intrinsic embedding cost employed by the encoder; and 3) The discriminator's modulus of continuity improves with the geometric preservation of the data. Numerical experiments demonstrate the effectiveness of our approach in generating high-quality images and exhibiting robustness to both mode collapse and training instability.
翻译:暂无翻译