Unsupervised person re-identification is a challenging and promising task in computer vision. Recent unsupervised person re-identification methods have improved substantially by training with pseudo labels. However, appearance noise and label noise have received little explicit study in the unsupervised setting. To reduce the effect of appearance noise on the global features, we also use features from two local views and produce multi-scale features. We explore knowledge distillation to filter label noise. Specifically, we first train a teacher model on noisy pseudo labels in an iterative way, and then use the teacher model to guide the learning of our student model. In our setting, the student model converges quickly under the supervision of the teacher model, which reduces the interference from noisy labels that the teacher model suffered from heavily. With the noise in feature learning carefully handled, our multi-scale knowledge distillation proves very effective for unsupervised re-identification. Extensive experiments on three popular person re-identification datasets demonstrate the superiority of our method. In particular, our approach achieves state-of-the-art accuracy of 85.7% mAP and 94.3% Rank-1 on the challenging Market-1501 benchmark with ResNet-50 under the fully unsupervised setting.
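The abstract does not say how the two local views are defined or which distillation loss is used, so the following is only a minimal sketch under stated assumptions: the local views are taken to be the upper and lower halves of a row-wise feature map, and the teacher guides the student through a temperature-softened KL divergence (the common form of knowledge distillation). All function names and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def mean_pool(rows):
    """Average a list of equal-length feature vectors into one vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def multi_scale_features(feature_rows):
    """Return (global, upper, lower) descriptors from row-wise features.

    feature_rows holds H feature vectors ordered top to bottom over the image.
    Assumption: the "two local views" are the upper and lower halves.
    """
    if len(feature_rows) < 2:
        raise ValueError("need at least two rows to form two local views")
    half = len(feature_rows) // 2
    return (
        mean_pool(feature_rows),         # global view
        mean_pool(feature_rows[:half]),  # upper local view
        mean_pool(feature_rows[half:]),  # lower local view
    )


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    Assumption: a standard soft-label distillation loss; the paper may
    use a different objective.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Toy usage: a 4-row feature map with 2 channels per row.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
global_feat, upper_feat, lower_feat = multi_scale_features(rows)
loss = distillation_loss([0.5, 1.5, -1.0], [2.0, 0.0, -1.0])
```

In this sketch the student is penalised only for disagreeing with the teacher's softened predictions, so it never reads the noisy pseudo labels directly; whether the paper combines this with a pseudo-label loss is not stated in the abstract.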