Existing public person re-identification (ReID) datasets are small by modern standards because labeling is difficult. Although unlabeled surveillance video is abundant and relatively easy to obtain, it is unclear how to leverage this footage to learn meaningful ReID representations. In particular, most existing unsupervised and domain adaptation ReID methods use only the public datasets in their experiments, with labels removed. In addition, because of the small data sizes, these methods usually rely on fine-tuning on unlabeled training data from the testing domain to achieve good performance. Inspired by recent progress in large-scale self-supervised image classification using contrastive learning, we propose to learn ReID representations from large-scale unlabeled surveillance video alone. Assisted by off-the-shelf pedestrian detection tools, we apply the contrastive loss at both the image and the tracklet levels. Combined with a principal component analysis step that uses freely available camera labels, our method, evaluated on a large-scale unlabeled dataset, shows far superior performance among unsupervised methods that do not use any training data from the testing domain. Furthermore, accuracy improves with data size, so our method has great potential with even larger and more diverse datasets.
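To make the idea of a contrastive loss at two levels concrete, the following is a minimal NumPy sketch of an InfoNCE-style loss, not the formulation used in this work. It assumes that an image-level positive pair is two augmented views of the same crop and that a tracklet-level positive pair is two different crops from the same tracklet. The names `info_nce` and `sample_tracklet_pairs`, the temperature value, and the pairing strategy are all hypothetical choices made for illustration.

```python
import numpy as np


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (illustrative, not the paper's exact loss).

    Row i of `anchors` is treated as matching row i of `positives`;
    every other row in the batch serves as a negative.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal entries as the correct class.
    return -np.mean(np.diag(log_prob))


def sample_tracklet_pairs(tracklet_ids, rng):
    """Assumed pairing rule: for each tracklet with at least two crops,
    pick two different crop indices to form a positive pair."""
    idx_a, idx_b = [], []
    for tid in np.unique(tracklet_ids):
        members = np.flatnonzero(tracklet_ids == tid)
        if len(members) < 2:
            continue  # a single crop cannot form a tracklet-level pair
        i, j = rng.choice(members, size=2, replace=False)
        idx_a.append(i)
        idx_b.append(j)
    return np.array(idx_a), np.array(idx_b)


def combined_loss(view1, view2, features, tracklet_ids, rng):
    """Sum of an image-level and a tracklet-level term (equal weights assumed)."""
    image_term = info_nce(view1, view2)
    ia, ib = sample_tracklet_pairs(tracklet_ids, rng)
    tracklet_term = info_nce(features[ia], features[ib])
    return image_term + tracklet_term
```

Here `view1` and `view2` stand for embeddings of two augmentations of the same detected pedestrian crops, while `features` and `tracklet_ids` stand for per-crop embeddings and the tracklet each crop belongs to, as would be produced by an off-the-shelf detector and tracker. In practice the loss would be computed in an automatic differentiation framework; NumPy is used here only to keep the arithmetic visible.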