The current evaluation protocol of long-tailed visual recognition trains the classification model on the long-tailed source label distribution and evaluates its performance on the uniform target label distribution. Such a protocol has questionable practicality since the target may also be long-tailed. Therefore, we formulate long-tailed visual recognition as a label shift problem where the target and source label distributions are different. One of the significant hurdles in dealing with the label shift problem is the entanglement between the source label distribution and the model prediction. In this paper, we focus on disentangling the source label distribution from the model prediction. We first introduce a simple baseline method that matches the target label distribution by post-processing the model prediction trained with the cross-entropy loss and the Softmax function. Although this method surpasses state-of-the-art methods on benchmark datasets, it can be further improved by directly disentangling the source label distribution from the model prediction in the training phase. Thus, we propose a novel method, LAbel distribution DisEntangling (LADE) loss, based on the optimal bound of the Donsker-Varadhan representation. LADE achieves state-of-the-art performance on benchmark datasets such as CIFAR-100-LT, Places-LT, ImageNet-LT, and iNaturalist 2018. Moreover, LADE outperforms existing methods on various shifted target label distributions, showing the general adaptability of our proposed method.
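The post-processing baseline mentioned above can be sketched as a post-hoc logit adjustment: a model trained with standard cross-entropy on the long-tailed source implicitly absorbs the source prior p_s(y) into its Softmax output, so replacing it with the target prior p_t(y) amounts to shifting each class logit by -log p_s(y) + log p_t(y). The sketch below is a minimal illustration under that assumption; the array values (`logits`, `source_prior`, `target_prior`) are hypothetical toy inputs, not from the paper.

```python
import numpy as np

def adjust_to_target(logits, source_prior, target_prior):
    """Shift class logits from the source prior to the target prior:
    adjusted_logit_y = logit_y - log p_s(y) + log p_t(y)."""
    return logits - np.log(source_prior) + np.log(target_prior)

# Toy example: 3 classes, long-tailed source, uniform target.
logits = np.array([2.0, 1.0, 0.5])          # raw model scores
source_prior = np.array([0.7, 0.2, 0.1])    # long-tailed training prior
target_prior = np.full(3, 1.0 / 3.0)        # uniform test prior

adjusted = adjust_to_target(logits, source_prior, target_prior)
probs = np.exp(adjusted) / np.exp(adjusted).sum()
# Relative to the unadjusted Softmax, probability mass moves from the
# head class (index 0) toward the tail class (index 2).
```

Note that this correction touches only inference; the training objective is unchanged, which is why the abstract argues that disentangling the source label distribution during training (as LADE does) can improve further.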