Conditional generative models have achieved remarkable results in recent years, but they typically require large amounts of labelled data. ClusterGAN recently achieved impressive clustering results by combining unsupervised conditional generation with a clustering inference network. However, ClusterGAN ignores the real conditional distribution of the data: its clustering inference network is trained only on samples generated from a uniform prior, whereas the true distribution is not necessarily balanced. Consequently, ClusterGAN fails to produce all modes, which results in sub-optimal performance of the clustering inference network. It is therefore important to learn, in an unsupervised way, a prior that tries to match the real distribution. In this paper, we propose self-augmentation information maximization improved ClusterGAN (SIMI-ClusterGAN), which learns distinctive priors directly from the data. SIMI-ClusterGAN consists of four deep neural networks: a self-augmentation prior network, a generator, a discriminator, and a clustering inference network. The method is validated on seven benchmark datasets and shows improved performance over state-of-the-art methods. To demonstrate its performance on imbalanced data, we consider two imbalanced conditions on the MNIST dataset: a one-class imbalance case and a three-class imbalance case. The results highlight the advantages of SIMI-ClusterGAN.
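To make the role of the prior concrete, the following is a minimal sketch of how a ClusterGAN-style latent code could be sampled under a uniform versus a non-uniform categorical prior. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_latent`, the noise scale `sigma`, the dimensions, and the hand-set probability vector `imbalanced` are all hypothetical. In SIMI-ClusterGAN the prior would come from the self-augmentation prior network, which is not modelled here.

```python
import numpy as np


def sample_latent(batch_size, n_clusters, noise_dim, class_probs=None,
                  sigma=0.1, rng=None):
    """Sample latent codes z = (z_n, z_c): Gaussian noise plus a one-hot cluster code.

    class_probs=None gives the uniform categorical prior. Passing a probability
    vector stands in for a learned prior (assumption: set by hand here).
    """
    rng = np.random.default_rng() if rng is None else rng
    if class_probs is None:
        class_probs = np.full(n_clusters, 1.0 / n_clusters)
    class_probs = np.asarray(class_probs, dtype=float)

    # Draw one cluster index per sample from the categorical prior.
    labels = rng.choice(n_clusters, size=batch_size, p=class_probs)
    z_c = np.eye(n_clusters)[labels]                            # one-hot part
    z_n = sigma * rng.standard_normal((batch_size, noise_dim))  # continuous part
    return np.concatenate([z_n, z_c], axis=1), labels


# Uniform prior: every cluster is sampled equally often on average.
z_uniform, y_uniform = sample_latent(64, n_clusters=10, noise_dim=30)

# Hypothetical one-class imbalance: cluster 0 is ten times rarer than the others.
imbalanced = np.ones(10)
imbalanced[0] = 0.1
imbalanced /= imbalanced.sum()
z_imbalanced, y_imbalanced = sample_latent(64, 10, 30, class_probs=imbalanced)
```

With the uniform prior, the generator is asked to produce the rare class as often as every other class, which is the mismatch the abstract describes. Supplying a prior closer to the data distribution changes only the sampling step, so the rest of a training loop could stay as it is.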