Class imbalance is a fundamental problem in computer vision applications such as semantic segmentation. Specifically, uneven class distributions in a training dataset often result in unsatisfactory performance on under-represented classes. Many works have proposed to weight the standard cross entropy loss with pre-computed weights based on class statistics, such as the number of samples or class margins. These methods have two major drawbacks: 1) constantly up-weighting minority classes can introduce excessive false positives in semantic segmentation, lowering precision; 2) a minority class is not necessarily a hard class. In this regard, we propose a hard-class mining loss that reshapes the vanilla cross entropy loss to weight the loss for each class dynamically based on instantaneous recall performance. We show that this novel recall loss interpolates gradually between the standard cross entropy loss and the inverse frequency weighted loss. Recall loss also improves mean accuracy while offering competitive mean Intersection over Union (IoU) performance. On the Synthia dataset, recall loss achieves a $9\%$ relative improvement in mean accuracy with competitive mean IoU using DeepLab-ResNet18, compared to the cross entropy loss. Code is available at https://github.com/PotatoTian/recall-semseg.
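The core idea — down-weighting classes the model already recalls well and up-weighting those it misses — can be illustrated with a minimal sketch. This is a hypothetical NumPy implementation assuming a per-class weight of $(1 - \text{recall}_c)$ computed from the current batch; the function name `recall_ce_loss`, the `eps` smoothing term, and the exact weighting form are illustrative assumptions, not the paper's official formulation (see the linked repository for that).

```python
import numpy as np

def recall_ce_loss(logits, targets, eps=1e-8):
    """Cross entropy weighted per class by (1 - instantaneous recall).

    logits: (N, C) array of raw scores; targets: (N,) integer labels.
    Sketch only: classes with many false negatives in this batch
    (low recall) receive larger weights; well-recalled classes are
    down-weighted, avoiding the fixed up-weighting of frequency-based
    schemes.
    """
    n, c = logits.shape
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    preds = probs.argmax(axis=1)

    # Per-class recall = TP / (TP + FN), estimated from the batch.
    weights = np.ones(c)
    for k in range(c):
        mask = targets == k
        if mask.any():
            recall = (preds[mask] == k).mean()
            weights[k] = 1.0 - recall + eps  # hard classes -> larger weight

    # Standard negative log-likelihood, scaled by the weight of each
    # sample's ground-truth class.
    nll = -np.log(probs[np.arange(n), targets] + eps)
    return float((weights[targets] * nll).mean())
```

Because the weights are recomputed every batch, the loss adapts as training progresses: early on, poorly-recalled minority classes dominate the gradient; once their recall improves, their weight decays toward zero rather than staying fixed at an inverse-frequency value.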