Improving discriminative feature learning is central to classification. Existing works address this problem by explicitly increasing inter-class separability and intra-class similarity, either by constructing positive and negative pairs for contrastive learning or by imposing tighter class-separating margins. These methods do not exploit the similarity between different classes, as they adhere to the i.i.d. assumption on the data. In this paper, we embrace the real-world setting in which some classes share semantic overlap due to similar appearances or concepts. Under this hypothesis, we propose a novel regularization to improve discriminative learning. We first calibrate the estimated highest likelihood of a sample based on its semantically neighboring classes, then encourage the overall likelihood predictions to be deterministic by imposing an adaptive exponential penalty. Since the gradient of the proposed method is roughly proportional to the uncertainty of the predicted likelihoods, we name it adaptive discriminative regularization (ADR); it is trained jointly with a standard cross-entropy loss. Extensive experiments demonstrate that ADR yields consistent and non-trivial performance improvements across a variety of visual classification tasks (over 10 benchmarks). Furthermore, we find it is robust to long-tailed and noisy-label data distributions, and its flexible design makes it compatible with mainstream classification architectures and losses.
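To make the two-step idea concrete, below is a minimal PyTorch sketch of what such a regularizer could look like. The abstract does not specify the exact formulas, so the neighbor count k, the use of the next-highest predicted likelihoods as "semantically neighboring classes", the 0.5 redistribution weight, the exponential penalty shape, and all names below (adr_regularizer, calibrated_max) are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def adr_regularizer(logits: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Hypothetical ADR-style penalty (assumed form, not the paper's exact one).

    Step 1: calibrate the highest predicted likelihood using its k
    semantically neighboring classes (approximated here by the next-highest
    predicted likelihoods). Step 2: apply an exponential penalty that is
    small for confident predictions and large for uncertain ones.
    """
    probs = F.softmax(logits, dim=-1)            # (batch, num_classes)
    topk, _ = probs.topk(k + 1, dim=-1)          # top-1 plus k neighbors
    # Calibration: fold part of the neighbors' mass into the top likelihood
    # (the 0.5 weight is an arbitrary illustrative choice).
    calibrated_max = (topk[:, 0] + 0.5 * topk[:, 1:].sum(dim=-1)).clamp(max=1.0)
    # Exponential penalty: its gradient magnitude exp(1 - calibrated_max)
    # grows with the residual uncertainty (1 - calibrated_max), matching the
    # abstract's claim that the gradient roughly tracks predictive uncertainty.
    return (torch.exp(1.0 - calibrated_max) - 1.0).mean()

# Usage: trained jointly with a standard cross-entropy loss, with a
# hypothetical weighting coefficient of 0.1.
logits = torch.randn(8, 100, requires_grad=True)
targets = torch.randint(0, 100, (8,))
loss = F.cross_entropy(logits, targets) + 0.1 * adr_regularizer(logits)
loss.backward()
```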