Text-to-image generation has achieved impressive results in recent years. Models based on different approaches have been proposed and trained on huge datasets of text-image pairs. Some methods, however, rely on pre-trained models such as Generative Adversarial Networks (GANs): they search the latent space of the generative model with a gradient-based approach that updates the latent vector, guided by loss functions such as cosine similarity. In this work, we follow a different direction and propose using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to explore the latent space of GANs. We compare this approach with one based on the Adam optimizer and with a hybrid strategy. We design an experimental study that compares the three approaches on different text inputs for image generation, adapting an evaluation method that projects the resulting samples onto a two-dimensional grid to inspect the diversity of the distributions. The results show that the evolutionary method achieves more diversity in the generated samples, exploring different regions of the resulting grids. In addition, we show that the hybrid method combines the areas explored by the gradient-based and evolutionary approaches, improving the quality of the results.
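To make the idea of a gradient-free latent-space search concrete, the following is a minimal sketch and not the method evaluated in this work. It uses a simplified evolution strategy with an isotropic Gaussian and a decaying step size, so it omits the covariance adaptation that defines CMA-ES. The function `toy_generator` is a hypothetical stand-in for the pipeline "pre-trained GAN generator followed by an image encoder", and `target` plays the role of a text embedding; all names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_generator(z):
    """Hypothetical stand-in for 'GAN generator + image encoder'.

    Maps a latent vector to an embedding of the same size with a fixed
    nonlinear function. A real setup would call pre-trained networks here.
    """
    n = len(z)
    return [math.tanh(z[i] + 0.5 * z[(i + 1) % n]) for i in range(n)]


def evolve_latent(target, population=20, parents=5, sigma=0.5,
                  generations=60, seed=0):
    """Simplified evolution strategy (NOT full CMA-ES: no covariance update).

    Samples candidates around a mean latent vector, keeps the best ones by
    cosine similarity to `target`, and recombines them into the next mean.
    """
    rng = random.Random(seed)
    dim = len(target)
    mean = [0.0] * dim
    best_z = list(mean)
    best_score = cosine_similarity(toy_generator(mean), target)

    for _ in range(generations):
        # Sample a population of latent vectors around the current mean.
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(population)]
        # Rank candidates by fitness (higher cosine similarity is better).
        scored = sorted(
            ((cosine_similarity(toy_generator(z), target), z) for z in candidates),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Track the best latent vector seen so far.
        if scored[0][0] > best_score:
            best_score, best_z = scored[0]
        # Recombine the top candidates into the new mean.
        elite = [z for _, z in scored[:parents]]
        mean = [sum(z[i] for z in elite) / parents for i in range(dim)]
        # Shrink the step size with a fixed decay (an assumed schedule).
        sigma *= 0.97

    return best_z, best_score


if __name__ == "__main__":
    # Assumed 8-dimensional target embedding, for illustration only.
    target = [0.9, -0.4, 0.2, 0.7, -0.8, 0.1, -0.3, 0.5]
    z, score = evolve_latent(target)
    print("best cosine similarity:", round(score, 3))
```

Because the search only needs fitness values, the same loop would work with a non-differentiable scoring function; a gradient-based optimizer such as Adam would instead require differentiating the loss with respect to the latent vector.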