In this paper, we study how to learn a generalizable person re-identification (re-ID) representation from unlabeled videos. Compared with 1) the popular unsupervised re-ID setting, where the training and test sets typically come from the same domain, and 2) the popular domain generalization (DG) re-ID setting, where the training samples are labeled, our novel scenario combines their key challenges: the training samples are unlabeled and are collected from various domains that do not align with the test domain. In other words, we aim to learn a representation in an unsupervised manner and directly use the learned representation for re-ID in novel domains. To fulfill this goal, we make two main contributions. First, we propose Cycle Association (CycAs), a scalable self-supervised learning method for re-ID with low training complexity. Second, we construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method. Specifically, CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs, and the training cost is merely linear in the data size, making large-scale training possible. The LMP-video dataset, in turn, is extremely large, containing 50 million unlabeled person images cropped from over 10K YouTube videos, and is therefore sufficient to serve as fertile soil for self-supervised learning. We show that CycAs, trained on LMP-video, generalizes well to novel domains. The achieved results sometimes even outperform supervised domain-generalizable models. Remarkably, CycAs achieves 82.2% Rank-1 on Market-1501 and 49.0% Rank-1 on MSMT17 with zero human annotation, surpassing state-of-the-art supervised DG re-ID methods. Moreover, we demonstrate the superiority of CycAs under the canonical unsupervised re-ID and the pretrain-and-finetune scenarios.
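To make the idea of cycle consistency between two successive frames concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each frame yields one embedding per detected person, uses a temperature-scaled softmax over cosine similarities as the soft association, and penalizes the negative log of the diagonal of the round-trip matrix. The function names, the temperature value, and the exact loss form are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cycle_association_loss(feats_a, feats_b, temperature=0.1):
    """Hypothetical cycle-consistency loss for one frame pair.

    feats_a: (n, d) embeddings of the person crops in frame A.
    feats_b: (m, d) embeddings of the person crops in frame B.
    The temperature and the log-diagonal loss are assumptions
    made for illustration only.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T                                   # (n, m)

    # Soft association in both directions (each row sums to 1).
    a_to_b = softmax(sim / temperature, axis=1)     # (n, m)
    b_to_a = softmax(sim.T / temperature, axis=1)   # (m, n)

    # Round trip A -> B -> A. Under cycle consistency each instance
    # should return to itself, so the diagonal should be close to 1.
    cycle = a_to_b @ b_to_a                         # (n, n)
    return float(-np.mean(np.log(np.diag(cycle) + 1e-12)))

if __name__ == "__main__":
    frame_a = np.eye(4)                 # four clearly distinct identities
    frame_b = frame_a[[2, 0, 3, 1]]     # same identities, shuffled order
    print(cycle_association_loss(frame_a, frame_b))
```

In this sketch the loss is computed independently for each frame pair and needs no identity labels, so the total cost grows with the number of pairs processed, which matches the linear training cost described above.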