Despite remarkable performance in producing realistic samples, Generative Adversarial Networks (GANs) often produce low-quality samples near low-density regions of the data manifold, especially for samples with minor features. Many techniques have been developed to improve the quality of generated samples, either by post-processing generated samples or by pre-processing the empirical data distribution, but at the cost of reduced diversity. To promote diversity in sample generation without degrading the overall quality, we propose a simple yet effective method to diagnose and emphasize underrepresented samples during training of a GAN. The main idea is to use the statistics of the discrepancy between the data distribution and the model distribution at each data instance. Based on the observation that the underrepresented samples have a high average discrepancy or high variability in discrepancy, we propose a method to emphasize those samples during training of a GAN. Our experimental results demonstrate that the proposed method improves GAN performance on various datasets, and it is especially effective in improving the quality and diversity of generated samples with minor features.
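The diagnose-and-emphasize idea can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the discrepancy is estimated, how the mean and variability are combined, or how emphasis is applied. The sketch assumes a per-sample discrepancy estimate has been recorded at several training checkpoints, combines mean and standard deviation with a hypothetical coefficient `k`, and turns the resulting score into sampling probabilities. The names `discrepancy_scores`, `sampling_weights`, `k`, and `floor` are all illustrative.

```python
import numpy as np


def discrepancy_scores(history, k=1.0):
    """Score each data instance from its recorded discrepancy history.

    history: array of shape (n_checkpoints, n_samples) holding an
             assumed per-sample discrepancy estimate at each checkpoint.
    k:       hypothetical weight on the variability term.
    """
    history = np.asarray(history, dtype=float)
    mean = history.mean(axis=0)   # average discrepancy per sample
    std = history.std(axis=0)     # variability of discrepancy per sample
    return mean + k * std


def sampling_weights(scores, floor=1e-3):
    """Map scores to sampling probabilities (assumed weighting scheme).

    Scores are shifted so the smallest is zero, a small floor keeps every
    sample reachable, and the result is normalized to sum to 1.
    """
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.min() + floor
    return shifted / shifted.sum()


# Toy history: 2 checkpoints, 3 samples.
history = np.array([[0.0, 2.0, 0.0],
                    [0.0, 2.0, 4.0]])
scores = discrepancy_scores(history, k=1.0)
weights = sampling_weights(scores)
```

Under these assumptions, the weights could drive a weighted draw of training batches, for example `np.random.default_rng().choice(len(weights), size=64, p=weights)`, so that instances with a high or unstable discrepancy are seen more often.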