Real-world visual recognition problems often exhibit long-tailed distributions, where the amount of training data across categories is highly imbalanced. Standard classification models trained on such distributions tend to make predictions biased towards the head classes while generalizing poorly to the tail classes. In this paper, we present two effective modifications of CNNs to improve network learning from long-tailed distributions. First, we present a Class Activation Map Calibration (CAMC) module that improves the learning and prediction of the network classifier by enforcing predictions to be based on important image regions. The proposed CAMC module highlights image regions that are correlated across the data and reinforces the representations in these areas to obtain a better global representation for classification. Furthermore, we investigate the use of normalized classifiers for representation learning in long-tailed problems. Our empirical study demonstrates that by simply scaling the outputs of the classifier with an appropriate scalar, we can effectively improve the classification accuracy on tail classes without sacrificing accuracy on the head classes. We conduct extensive experiments to validate the effectiveness of our design and set new state-of-the-art performance on five benchmarks, including ImageNet-LT, Places-LT, iNaturalist 2018, CIFAR10-LT, and CIFAR100-LT.
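To make the second idea concrete, the following is a minimal sketch of a normalized (cosine) classifier whose logits are multiplied by a scalar, in the spirit of the scaling described above. The class name, the default scale value, and the weight initialization are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledNormalizedClassifier(nn.Module):
    """Cosine classifier with scaled outputs (illustrative sketch).

    Both the input features and the class weight vectors are L2-normalized,
    so the raw logits are cosine similarities in [-1, 1]; a scalar then
    stretches them to a range suitable for softmax training.
    """

    def __init__(self, feat_dim: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        # One weight vector per class; small random init (assumed, not from the paper).
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.scale = scale  # hypothetical default; the paper tunes this scalar

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # L2-normalize features and class weights, then scale the cosine similarities.
        f = F.normalize(features, dim=1)
        w = F.normalize(self.weight, dim=1)
        return self.scale * F.linear(f, w)
```

In use, such a head would simply replace the final fully connected layer of the backbone, e.g. `logits = classifier(backbone(images))`, and be trained with the usual cross-entropy loss; the scalar controls how peaked the softmax distribution can become.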