Class imbalance is an inherent problem in many machine learning classification tasks, and it often leads to trained models that are unusable for any practical purpose. In this study we explore an unsupervised approach to addressing these imbalances by leveraging transfer learning from pre-trained image classification models to an encoder-based Generative Adversarial Network (eGAN). To the best of our knowledge, this is the first work to tackle this problem using a GAN without augmenting the training data with synthesized fake images. In the proposed approach, the discriminator network outputs a negative or positive score. Test samples with negative scores are classified as minority, and those with positive scores as majority. Our approach eliminates epistemic uncertainty in model predictions, since P(minority) + P(majority) need not sum to 1. We also explore the impact of transfer learning and of combining different pre-trained image classification models at the generator and discriminator. The best result, an F1-score of 0.69, was obtained on a CIFAR-10 classification task with an imbalance ratio of 1:2500. Our approach also provides a mechanism for thresholding the specificity or sensitivity of our machine learning system. Keywords: Class imbalance, Transfer learning, GAN, Nash equilibrium
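The sign-based decision rule and the thresholding mechanism described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example scores, and the choice to treat a score of exactly zero as majority are assumptions made for illustration, and the scores stand in for the output of a trained discriminator.

```python
def classify_by_score(scores, threshold=0.0):
    """Label a sample minority (1) if its score is below the threshold, else majority (0).

    With threshold=0.0 this is the sign rule from the abstract. A score of
    exactly 0 is assigned to the majority class here (an assumption; the
    abstract does not specify this case).
    """
    return [1 if s < threshold else 0 for s in scores]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity), treating minority (1) as the positive class.

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical discriminator scores and ground-truth labels (1 = minority).
scores = [-1.2, -0.1, 0.3, 0.8, 2.0, 0.05]
y_true = [1, 1, 1, 0, 0, 0]

# Sign rule: only negative scores are flagged as minority.
pred_sign = classify_by_score(scores)
# Raising the threshold flags more samples as minority.
pred_shifted = classify_by_score(scores, threshold=0.5)

print(sensitivity_specificity(y_true, pred_sign))
print(sensitivity_specificity(y_true, pred_shifted))
```

In this toy data, moving the threshold from 0 to 0.5 catches the third minority sample at the cost of one false alarm, which is the kind of sensitivity/specificity trade-off the thresholding mechanism exposes.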