Recent studies show that models trained on synthetic datasets can achieve better generalizable person re-identification (GPReID) performance than those trained on public real-world datasets. Moreover, given the limitations of real-world person ReID datasets, it would also be valuable to use large-scale synthetic datasets as test sets for benchmarking person ReID algorithms. This raises a critical question: are synthetic datasets reliable for benchmarking generalizable person re-identification? The literature offers no evidence on this question. To address it, we design a method called Pairwise Ranking Analysis (PRA) that quantitatively measures ranking similarity and statistically tests whether two sets of ranking correlations follow identical distributions. Specifically, we use Kendall rank correlation coefficients to measure the pairwise similarity between algorithm rankings on different datasets. We then apply a non-parametric two-sample Kolmogorov-Smirnov (KS) test to judge whether the ranking correlations between synthetic and real-world datasets and those between real-world datasets only are drawn from identical distributions. We conduct comprehensive experiments with ten representative algorithms, three popular real-world person ReID datasets, and three recently released large-scale synthetic datasets. Based on the pairwise ranking analysis and these evaluations, we conclude that a recent large-scale synthetic dataset, ClonedPerson, can be reliably used to benchmark GPReID, statistically the same as real-world datasets. This study therefore supports using synthetic datasets as both the source training set and the target test set, without the privacy concerns raised by real-world surveillance data. The study may also inspire future designs of synthetic datasets.
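The two steps of PRA described above (Kendall rank correlation between dataset pairs, then a two-sample KS test on the two groups of correlations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dataset labels, and every score in `real_scores` and `synthetic_scores` are hypothetical placeholders, the Kendall coefficient is the tau-a variant without tie correction, and the KS p-value uses the standard asymptotic approximation, which is rough for samples this small.

```python
from bisect import bisect_right
from itertools import combinations
import math


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs.
    d = max(abs(bisect_right(a, v) / n - bisect_right(b, v) / m) for v in a + b)
    root_en = math.sqrt(n * m / (n + m))
    lam = (root_en + 0.12 + 0.11 / root_en) * d  # small-sample correction
    if lam < 0.2:
        # The Kolmogorov tail probability is 1 to within rounding here.
        return d, 1.0
    # Alternating series for the Kolmogorov survival function.
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def pairwise_ranking_analysis(real, synthetic):
    """Compare real-vs-real and synthetic-vs-real ranking correlations."""
    real_real = [kendall_tau(real[p], real[q]) for p, q in combinations(real, 2)]
    synth_real = [kendall_tau(synthetic, scores) for scores in real.values()]
    return real_real, synth_real, ks_two_sample(real_real, synth_real)


# Hypothetical accuracy scores of five algorithms (same order in every list).
real_scores = {
    "real_A": [40.1, 45.3, 50.2, 55.8, 60.4],
    "real_B": [30.5, 36.0, 34.2, 41.7, 47.9],
    "real_C": [20.3, 27.1, 29.4, 28.0, 35.6],
}
synthetic_scores = [55.0, 61.2, 63.5, 70.1, 74.8]

real_real, synth_real, (d_stat, p_value) = pairwise_ranking_analysis(
    real_scores, synthetic_scores
)
print("real vs real taus:     ", real_real)
print("synthetic vs real taus:", synth_real)
print("KS statistic:", d_stat, "p-value:", p_value)
```

A large p-value means the test finds no evidence that the synthetic-to-real correlations are distributed differently from the real-to-real ones. With only a handful of dataset pairs, an exact test such as `scipy.stats.ks_2samp` with `method="exact"` is likely a better choice than the asymptotic approximation used in this sketch.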