Recent Generative Adversarial Networks (GANs) achieve outstanding results through large-scale training, employing models composed of millions of parameters that require extensive computational resources. Building such huge models undermines their replicability and increases training instability. Moreover, multi-channel data, such as images or audio, are usually processed by real-valued convolutional networks that flatten and concatenate the input channels, losing any intra-channel spatial relation. To address these issues, we propose a family of quaternion-valued generative adversarial networks (QGANs). QGANs exploit the properties of quaternion algebra, e.g., the Hamilton product for convolutions. This allows the network to process the channels as a single entity and to capture internal latent relations, while reducing the overall number of parameters by a factor of 4. We show how to design QGANs and how to extend the proposed approach to advanced models. We compare the proposed QGANs with their real-valued counterparts on multiple image generation benchmarks. Results show that QGANs generate visually pleasing images and obtain better FID scores than their real-valued counterparts, while saving up to 75% of the training parameters. We believe these results may pave the way to novel, more accessible GANs that improve performance while saving computational resources.
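The parameter saving claimed above comes from the Hamilton product: a single quaternion weight (4 scalars) mixes 4 channels, where an unconstrained real-valued dense mixing of the same 4 channels would need 16 scalars, hence the 75% reduction. A minimal sketch of the Hamilton product (standard quaternion algebra, not the paper's actual layer implementation) could look like this:

```python
def hamilton_product(q, p):
    """Hamilton product of two quaternions q = a + bi + cj + dk
    and p = w + xi + yj + zk, each given as a 4-tuple of scalars.

    In a quaternion-valued layer, q plays the role of a weight and
    p of a 4-channel input: the same 4 weight scalars are shared
    across all 4 output components (4 params vs. 16 for a real
    4x4 mixing matrix -> 75% fewer parameters).
    """
    a, b, c, d = q
    w, x, y, z = p
    return (
        a*w - b*x - c*y - d*z,  # real part
        a*x + b*w + c*z - d*y,  # i component
        a*y - b*z + c*w + d*x,  # j component
        a*z + b*y - c*x + d*w,  # k component
    )

# Sanity checks from quaternion algebra:
# the identity 1 leaves any quaternion unchanged, and i * j = k.
print(hamilton_product((1, 0, 0, 0), (2, 3, 4, 5)))  # -> (2, 3, 4, 5)
print(hamilton_product((0, 1, 0, 0), (0, 0, 1, 0)))  # -> (0, 0, 0, 1)
```

In a quaternion convolution, each kernel entry is such a quaternion, so the product is applied per spatial position, treating the 4 channels (e.g., RGB plus a padding channel) as one entity.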