Object recognition techniques using convolutional neural networks (CNN) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large-vocabulary and long-tailed datasets, e.g. LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for other categories, making the tail categories receive more discouraging gradients. Based on this observation, we propose a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories by simply ignoring those gradients for rare categories. The equalization loss protects the learning of rare categories from being at a disadvantage during the network parameter updating. Thus the model is capable of learning better discriminative features for objects of rare classes. Without any bells and whistles, our method achieves AP gains of 4.1% and 4.8% for the rare and common categories on the challenging LVIS benchmark, compared to the Mask R-CNN baseline. With the utilization of the effective equalization loss, we finally won the 1st place in the LVIS Challenge 2019. Code has been made available at: https://github.com/tztztztztz/eql.detectron2
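The core idea above (masking out the discouraging negative-sample gradients that rare categories receive from positives of other categories) can be sketched as a reweighted per-class sigmoid cross-entropy. The following is a minimal NumPy sketch for a single proposal, not the authors' implementation; the threshold `lam` and the helper name `equalization_loss` are illustrative, and the weight follows the form w_j = 1 − E(r)·T_λ(f_j)·(1 − y_j) described in the paper.

```python
import numpy as np

def equalization_loss(logits, labels, class_freq, lam=1e-3, is_foreground=True):
    """Sketch of the equalization loss for one region proposal.

    logits:     (C,) raw per-class scores
    labels:     (C,) one-hot ground truth (all zeros for background)
    class_freq: (C,) dataset frequency of each category
    lam:        frequency threshold below which a category counts as rare
    """
    p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid probabilities
    bce = -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

    # T_lambda(f_j): 1 for rare (tail) categories, 0 otherwise.
    rare = (class_freq < lam).astype(float)
    # E(r): only suppress negative gradients on foreground proposals,
    # so background proposals still push all scores down.
    e_r = 1.0 if is_foreground else 0.0
    # w_j = 1 - E(r) * T_lambda(f_j) * (1 - y_j): drop the negative-sample
    # term for rare categories, so tail classes stop being penalized by
    # positives of other categories.
    w = 1.0 - e_r * rare * (1.0 - labels)
    return float(np.sum(w * bce))
```

Setting `lam=0` recovers the plain per-class sigmoid cross-entropy, since no category is then treated as rare.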