Deep learning-based person Re-IDentification (ReID) often requires a large amount of training data to achieve good performance. It therefore seems natural that collecting more training data from diverse environments should improve ReID performance. This paper re-examines this common belief and makes a somewhat surprising observation: using more samples, i.e., training with samples from multiple datasets, does not necessarily lead to better performance with popular ReID models. In some cases, training with more samples may even hurt performance when the evaluation is carried out on one of those datasets. We postulate that this phenomenon is due to the incapability of the standard network in adapting to diverse environments. To overcome this issue, we propose an approach called Domain-Camera-Sample Dynamic network (DCSD), whose parameters adapt to various factors. Specifically, we consider internal domain-related factors that can be identified from the input features, as well as external domain-related factors, such as domain information or camera information. Our discovery is that training such an adaptive model benefits more from additional training samples. Experimental results show that our DCSD can greatly boost performance (by up to 12.3%) when jointly training on multiple datasets.
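To make the idea of "parameters adaptive to various factors" concrete, here is a minimal, hypothetical sketch of a dynamically modulated layer. All names and the gating scheme are assumptions for illustration (a FiLM-style gate), not the paper's actual DCSD architecture: the layer's effective weights are modulated by an internal factor inferred from the input features and by external factors indexed by domain and camera IDs.

```python
import numpy as np

rng = np.random.default_rng(0)

class DynamicLinear:
    """Hypothetical sketch: a linear layer whose output is gated by
    (a) an internal factor computed from the input features and
    (b) external per-domain and per-camera scaling vectors."""

    def __init__(self, in_dim, out_dim, num_domains, num_cameras):
        self.base_w = rng.standard_normal((out_dim, in_dim)) * 0.1
        # external factors: one scaling vector per domain and per camera
        self.domain_scale = rng.standard_normal((num_domains, out_dim)) * 0.1
        self.camera_scale = rng.standard_normal((num_cameras, out_dim)) * 0.1
        # controller mapping input features to an internal per-channel gate
        self.ctrl_w = rng.standard_normal((out_dim, in_dim)) * 0.1

    def __call__(self, x, domain_id, camera_id):
        # internal factor, identified from the input features themselves
        internal_gate = np.tanh(self.ctrl_w @ x)
        # combine internal and external factors into one multiplicative gate
        gate = 1.0 + internal_gate \
                   + self.domain_scale[domain_id] \
                   + self.camera_scale[camera_id]
        return gate * (self.base_w @ x)

layer = DynamicLinear(in_dim=8, out_dim=4, num_domains=3, num_cameras=6)
x = rng.standard_normal(8)
y0 = layer(x, domain_id=0, camera_id=2)
y1 = layer(x, domain_id=1, camera_id=2)  # same input, different domain
```

The same input produces different outputs under different domain or camera IDs, which is the property that lets one model specialize per environment while sharing most parameters across the joint training set.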