The conventional wisdom behind learning deep classification models is to focus on misclassified examples and to ignore well-classified examples that lie far from the decision boundary. For instance, when training with cross-entropy loss, examples with higher likelihoods (i.e., well-classified examples) contribute smaller gradients in back-propagation. However, we theoretically show that this common practice hinders representation learning, energy optimization, and the growth of the margin. To counteract this deficiency, we propose rewarding well-classified examples with additive bonuses to revive their contribution to learning. This counterexample theoretically addresses all three issues. We support this claim empirically, either by directly verifying the theoretical results or through the significant performance improvements our counterexample brings on diverse tasks, including image classification, graph classification, and machine translation. Furthermore, this paper shows that because our idea can solve these three issues, it can handle complex scenarios such as imbalanced classification, OOD detection, and applications under adversarial attack. Code is available at: https://github.com/lancopku/well-classified-examples-are-underestimated.
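The gradient-vanishing claim above can be illustrated with a minimal sketch. For a sigmoid output with true-class probability p = sigmoid(z), the cross-entropy gradient with respect to the logit z is -(1 - p), which shrinks toward zero as an example becomes well classified. One simple additive bonus of the kind the abstract describes (a hypothetical illustration, not necessarily the paper's exact formulation) is log(1 - p), whose gradient is -p; the combined gradient is then -1 for every example, so well-classified examples keep contributing:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ce_loss(z):
    # Standard cross-entropy for the true class: -log p, with p = sigmoid(z).
    return -math.log(sigmoid(z))

def encouraged_loss(z):
    # Hypothetical additive bonus log(1 - p) rewarding confident examples.
    # Analytically, the gradient wrt z becomes -(1 - p) - p = -1 everywhere.
    return ce_loss(z) + math.log(1.0 - sigmoid(z))

def num_grad(f, z, eps=1e-6):
    # Central-difference check of the analytic gradients above.
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

for z in (0.0, 2.0, 5.0):
    p = sigmoid(z)
    print(f"p={p:.4f}  dCE/dz={num_grad(ce_loss, z):+.4f}  "
          f"dBonus/dz={num_grad(encouraged_loss, z):+.4f}")
```

As p approaches 1, the cross-entropy gradient magnitude decays toward 0 while the bonus-augmented gradient stays at 1, which is the mechanism the abstract appeals to when it says the bonus "revives" the contribution of well-classified examples.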