RSG: A Simple but Effective Module for Learning Imbalanced Datasets

From arXiv; to appear at CVPR 2021. We propose a flexible data generation/data augmentation module for long-tailed classification. Code is available at: https://github.com/Jianf-Wang/RSG

Imbalanced datasets widely exist in practice and are a great challenge for training deep neural models with a good generalization on infrequent classes. In this work, we propose a new rare-class sample generator (RSG) to solve this problem. RSG aims to generate some new samples for rare classes during training, and it has in particular the following advantages: (1) it is convenient to use and highly versatile, because it can be easily integrated into any kind of convolutional neural network, and it works well when combined with different loss functions, and (2) it is only used during the training phase, and therefore, no additional burden is imposed on deep neural networks during the testing phase. In extensive experimental evaluations, we verify the effectiveness of RSG. Furthermore, by leveraging RSG, we obtain competitive results on Imbalanced CIFAR and new state-of-the-art results on Places-LT, ImageNet-LT, and iNaturalist 2018. The source code is available at https://github.com/Jianf-Wang/RSG.
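
Advantage (2) in the abstract, a module that is active only during training, can be illustrated with a small sketch. This is not the RSG algorithm and not the API of the linked repository: the class name `TrainOnlyGenerator`, its parameters, and the noise-based "generation" are all hypothetical placeholders, chosen only to show how a generator can add rare-class samples in training mode and become an identity at test time.

```python
import random


class TrainOnlyGenerator:
    """Hypothetical stand-in for a training-only sample generator.

    Assumption: the real RSG generation procedure is replaced here by a
    jittered copy of each rare-class feature vector.
    """

    def __init__(self, rare_classes, noise_scale=0.1, seed=0):
        self.rare_classes = set(rare_classes)
        self.noise_scale = noise_scale
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, features, labels):
        # At test time the module is an identity: no extra samples, no extra cost.
        if not self.training:
            return features, labels
        out_features, out_labels = list(features), list(labels)
        for feat, label in zip(features, labels):
            if label in self.rare_classes:
                # Placeholder generation step (assumption, not the paper's method).
                new_feat = [x + self._rng.gauss(0.0, self.noise_scale) for x in feat]
                out_features.append(new_feat)
                out_labels.append(label)
        return out_features, out_labels


features = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
labels = [0, 0, 1]  # class 1 is treated as the rare class in this toy batch

gen = TrainOnlyGenerator(rare_classes=[1])
train_feats, train_labels = gen(features, labels)  # training mode: one extra sample
gen.eval()
test_feats, test_labels = gen(features, labels)    # eval mode: batch unchanged
```

In training mode the toy batch grows by one sample for the rare class; after `eval()` the inputs pass through untouched, which mirrors the claim that no additional burden is imposed during testing.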
