Class-imbalanced data, in which some classes contain far more samples than others, is ubiquitous in real-world applications. Standard techniques for handling class imbalance usually work by training on a re-weighted loss or on re-balanced data. Unfortunately, training overparameterized neural networks on such objectives causes rapid memorization of minority-class data. To avoid this trap, we harness meta-learning, which uses both an "outer-loop" and an "inner-loop" loss, each of which may be balanced using different strategies. We evaluate our method, MetaBalance, on image classification, credit-card fraud detection, loan default prediction, and facial recognition tasks with severely imbalanced data, and we find that MetaBalance outperforms a wide array of popular re-sampling strategies.
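The abstract describes the core mechanism: an inner-loop update on one loss and an outer-loop (meta) loss, each of which can be balanced with a different strategy. Below is a minimal PyTorch sketch of that idea, assuming a MAML-style single inner gradient step; the function name `metabalance_step`, the use of cross-entropy for both losses, and the batch choices are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal sketch of a meta-learning step with separately balanced
# inner- and outer-loop losses. Illustrative only; the paper's exact
# procedure may differ.
import torch
import torch.nn.functional as F

def metabalance_step(model, opt, imbalanced_batch, balanced_batch, inner_lr=0.01):
    x_in, y_in = imbalanced_batch    # inner loop: raw, imbalanced data
    x_out, y_out = balanced_batch    # outer loop: class-balanced data

    params = dict(model.named_parameters())

    # Inner loop: one gradient step on the (unbalanced) inner loss.
    inner_logits = torch.func.functional_call(model, params, (x_in,))
    inner_loss = F.cross_entropy(inner_logits, y_in)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}

    # Outer loop: evaluate the adapted model on the balanced loss and
    # backpropagate through the inner update to the original parameters.
    outer_logits = torch.func.functional_call(model, adapted, (x_out,))
    outer_loss = F.cross_entropy(outer_logits, y_out)
    opt.zero_grad()
    outer_loss.backward()
    opt.step()
    return outer_loss.item()
```

In this sketch, the "different strategies" mentioned above would correspond to how the two batches are drawn, e.g., a uniform sampler for the inner batch and a class-balanced sampler for the outer batch.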