Person re-identification (re-id) remains challenging due to significant intra-class variations across different cameras. Recently, there has been a growing interest in using generative models to augment training data and enhance the invariance to input changes. The generative pipelines in existing methods, however, stay relatively separate from the discriminative re-id learning stages. Accordingly, re-id models are often trained in a straightforward manner on the generated data. In this paper, we seek to improve learned re-id embeddings by better leveraging the generated data. To this end, we propose a joint learning framework that couples re-id learning and data generation end-to-end. Our model involves a generative module that separately encodes each person into an appearance code and a structure code, and a discriminative module that shares the appearance encoder with the generative module. By switching the appearance or structure codes, the generative module is able to generate high-quality cross-id composed images, which are fed back online to the appearance encoder and used to improve the discriminative module. The proposed joint learning framework renders significant improvement over the baseline without using generated data, leading to state-of-the-art performance on several benchmark datasets.
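To make the described architecture concrete, below is a minimal PyTorch-style sketch of the joint generative/discriminative idea: an appearance encoder shared by both modules, a structure encoder, a decoder that recombines codes from two different identities into a cross-id composed image, and an online feedback path through the shared encoder. All module names, layer sizes, and input shapes here are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class AppearanceEncoder(nn.Module):
    """Encodes an image into an appearance code; shared with the re-id branch."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )
    def forward(self, x):
        return self.net(x)

class StructureEncoder(nn.Module):
    """Encodes pose/structure cues (gray-scale input suppresses appearance)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, dim, 4, 2, 1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Recombines an appearance code with a structure code into an image."""
    def __init__(self, app_dim=128, str_dim=64):
        super().__init__()
        self.fc = nn.Linear(app_dim, str_dim)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(str_dim, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
        )
    def forward(self, app_code, str_code):
        # Broadcast the appearance code over the spatial structure map.
        a = self.fc(app_code)[:, :, None, None]
        return self.net(str_code + a)

# One forward pass of the joint pipeline (losses omitted for brevity).
E_a, E_s, G = AppearanceEncoder(), StructureEncoder(), Decoder()
id_head = nn.Linear(128, 751)           # re-id classifier; 751 = e.g. Market-1501 training identities
x_i = torch.randn(4, 3, 64, 32)         # images of person i
x_j = torch.randn(4, 3, 64, 32)         # images of person j
gray_j = x_j.mean(dim=1, keepdim=True)  # structure input for person j

a_i = E_a(x_i)                          # appearance code of person i
s_j = E_s(gray_j)                       # structure code of person j
x_gen = G(a_i, s_j)                     # cross-id composed image: i's appearance, j's structure
logits = id_head(E_a(x_gen))            # generated image fed back online through the shared encoder
```

Because the appearance encoder is shared, gradients from the re-id objective on both real and cross-id composed images flow into the same embedding, which is how the generation stage can improve the discriminative module end-to-end in this sketch.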