The conventional wisdom behind learning deep classification models is to focus on poorly classified examples and to ignore well-classified examples that lie far from the decision boundary. For instance, when training with cross-entropy loss, examples with higher likelihoods (i.e., well-classified examples) contribute smaller gradients in back-propagation. However, we theoretically show that this common practice hinders representation learning, energy optimization, and margin growth. To counteract this deficiency, we propose to reward well-classified examples with additive bonuses, reviving their contribution to learning. This counterexample theoretically addresses all three issues. We support the claim empirically, both by directly verifying the theoretical results and through significant performance improvements with our counterexample on diverse tasks, including image classification, graph classification, and machine translation. Furthermore, this paper shows that because our idea resolves these three issues, it can handle complex scenarios such as imbalanced classification, OOD detection, and applications under adversarial attacks.
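To make the "additive bonus" idea concrete, the following is a minimal NumPy sketch that augments cross-entropy with a mirrored log bonus. The function name and the exact bonus form are illustrative assumptions, not the paper's definitive formulation; the key property is that the bonus term keeps the gradient large even when the true-class probability is close to 1.

```python
import numpy as np

def encouraging_loss(probs, labels):
    """Cross-entropy plus a hypothetical additive bonus that rewards
    well-classified examples (a sketch; the paper's exact bonus may differ)."""
    p = probs[np.arange(len(labels)), labels]  # probability of the true class
    ce = -np.log(p)                            # standard cross-entropy term
    bonus = np.log(1.0 - p)                    # mirrored bonus: more negative as p -> 1
    return np.mean(ce + bonus)                 # the (negative) bonus rewards confidence
```

Note the gradient of the per-example loss with respect to p is -1/p - 1/(1 - p): unlike plain cross-entropy, whose gradient magnitude -1/p shrinks toward 1 as p grows, the bonus term makes the magnitude grow as p approaches 1, so well-classified examples keep contributing to learning.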