Unsupervised Person Re-identification (U-ReID) with pseudo labeling has recently reached performance competitive with fully-supervised ReID methods, thanks to modern clustering algorithms. However, such clustering-based schemes become computationally prohibitive for large-scale datasets. How to efficiently leverage endless unlabeled data with limited computing resources for better U-ReID remains under-explored. In this paper, we make the first attempt at large-scale U-ReID and propose a "small data for big task" paradigm dubbed Meta Clustering Learning (MCL). MCL pseudo-labels only a subset of the entire unlabeled data via clustering, saving computation in the first training phase. After that, the learned cluster centroids, termed meta-prototypes in our MCL, serve as a proxy annotator to softly annotate the remaining unlabeled data and further polish the model. To alleviate potential label noise in the polishing phase, we enforce two well-designed loss constraints that ensure intra-identity consistency and strong inter-identity correlation. On multiple widely-used U-ReID benchmarks, our method significantly reduces computational cost while achieving comparable or even better performance than prior works.
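The core soft-annotation step above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's actual implementation): cluster centroids computed from the pseudo-labeled subset act as meta-prototypes, and each remaining unlabeled sample receives a soft label given by a temperature-scaled softmax over its cosine similarities to the prototypes. The function name and the `temperature` parameter are assumptions for the sketch.

```python
import numpy as np

def soft_label_with_prototypes(subset_feats, subset_labels, rest_feats, temperature=0.05):
    """Soft-label remaining features using cluster centroids as a proxy annotator.

    subset_feats:  (n, d) features of the clustered subset
    subset_labels: (n,)   pseudo labels obtained by clustering the subset
    rest_feats:    (m, d) features of the remaining unlabeled data
    Returns an (m, k) matrix of soft label distributions over k meta-prototypes.
    """
    k = subset_labels.max() + 1
    # Meta-prototypes: mean feature of each pseudo identity, L2-normalized
    protos = np.stack([subset_feats[subset_labels == c].mean(axis=0) for c in range(k)])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)

    rest = rest_feats / np.linalg.norm(rest_feats, axis=1, keepdims=True)
    sims = rest @ protos.T                       # cosine similarity to each prototype
    logits = sims / temperature                  # lower temperature -> sharper labels
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability for softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)
```

Samples close to one prototype receive a near one-hot label, while ambiguous samples keep mass on several identities, which is where the two loss constraints come in to suppress label noise.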