Inverse Image Frequency for Long-tailed Image Recognition

Long-tailed distributions are common in the real world. Large-scale image datasets inevitably exhibit the long-tailed property, and models trained on such imbalanced data achieve high performance on over-represented categories but struggle on under-represented ones, leading to biased predictions and degraded performance. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works and is especially useful for downstream tasks such as long-tailed instance segmentation, as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.2% segmentation AP with MaskRCNN on LVIS. Code is available at https://github.com/kostas1515/iif
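
To make the idea of a multiplicative logit adjustment concrete, here is a minimal sketch in plain Python. The abstract does not give the exact weighting formula, so the form used here, w_c = -log(n_c / N) by analogy with inverse document frequency, is an assumption, as are the function names, class counts and logits; the linked repository should be consulted for the authors' actual variants.

```python
import math

def iif_weights(class_counts):
    """Assumed IDF-style weights: w_c = -log(n_c / N). Rarer classes get larger weights."""
    total = sum(class_counts)
    return [-math.log(n / total) for n in class_counts]

def adjust_logits(logits, weights):
    """Multiplicative adjustment: scale each class logit by its class weight."""
    return [z * w for z, w in zip(logits, weights)]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical training-set sizes for a head, a medium and a tail class.
counts = [1000, 100, 10]
# Hypothetical raw logits for one image; the head class narrowly wins.
logits = [2.0, 1.8, 1.7]

weights = iif_weights(counts)
adjusted = adjust_logits(logits, weights)

print("weights: ", weights)
print("before:  ", softmax(logits))
print("after:   ", softmax(adjusted))
```

In this toy case the raw logits favor the head class, while the adjusted logits favor the tail class, because its weight is the largest. Note that scaling by a positive weight amplifies a logit's magnitude in either direction, so the sketch deliberately uses positive logits; how negative logits are handled in practice is not covered by the abstract.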
