One of the main challenges for feature representation in deep learning-based classification is the design of loss functions with strong discriminative power. The classical softmax loss does not explicitly encourage discriminative feature learning. A popular research direction is to incorporate margins into well-established losses to enforce extra intra-class compactness and inter-class separability; these margin-based losses, however, were developed heuristically rather than from rigorous mathematical principles. In this work, we attempt to address this limitation by formulating a principled optimization objective: learning towards the largest margins. Specifically, we first define the class margin as the measure of inter-class separability and the sample margin as the measure of intra-class compactness. Accordingly, to encourage discriminative feature representations, the loss function should promote the largest possible margins for both classes and samples. Furthermore, we derive a generalized margin softmax loss to draw general conclusions about existing margin-based losses. This principled framework not only offers new perspectives for understanding and interpreting existing margin-based losses, but also provides insights that can guide the design of new tools, including sample margin regularization and the largest margin softmax loss for the class-balanced case, and zero-centroid regularization for the class-imbalanced case. Experimental results demonstrate the effectiveness of our strategy on a variety of tasks, including visual classification, imbalanced classification, person re-identification, and face verification.
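The abstract names two quantities, the class margin and the sample margin, without giving formulas. The following is a minimal sketch of one plausible reading, not the paper's stated definitions. It assumes that each class has a prototype vector (for example, a row of the final linear layer), that the class margin is the smallest pairwise angle between normalized prototypes, and that the sample margin is the gap between a sample's cosine similarity to its own class prototype and its highest similarity to any other prototype. The function names `class_margin` and `sample_margins` are hypothetical.

```python
import numpy as np


def normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis`."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)


def class_margin(prototypes):
    """Smallest pairwise angle (radians) between class prototypes.

    Assumed definition: a larger value means the classes are better separated.
    Expects at least two prototypes, shape (num_classes, dim).
    """
    w = normalize(prototypes)
    cos = np.clip(w @ w.T, -1.0, 1.0)
    # Exclude self-pairs: the largest remaining cosine is the smallest angle.
    np.fill_diagonal(cos, -1.0)
    return float(np.arccos(cos.max()))


def sample_margins(features, labels, prototypes):
    """Per-sample gap between true-class and best other-class cosine similarity.

    Assumed definition: a positive value means the sample is closer (in angle)
    to its own class prototype than to any other prototype.
    """
    cos = normalize(features) @ normalize(prototypes).T
    idx = np.arange(len(labels))
    target = cos[idx, labels]
    others = cos.copy()
    others[idx, labels] = -np.inf  # mask the true class before taking the max
    return target - others.max(axis=1)


if __name__ == "__main__":
    # Three mutually orthogonal prototypes, so every pairwise angle is pi/2.
    W = np.eye(3)
    X = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
    y = np.array([0, 0])
    print(class_margin(W))
    print(sample_margins(X, y, W))
```

Under these assumptions, a loss that "learns towards the largest margins" would push both quantities up: spreading the prototypes apart raises the class margin, and pulling each feature toward its own prototype raises the sample margin.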