Pedestrian attribute recognition aims to assign multiple attributes to a single pedestrian image captured by a video surveillance camera. Although numerous methods have been proposed and have made tremendous progress, we argue that it is time to step back and analyze the status quo of the area. We review and rethink the recent progress from three perspectives. First, given that there is no explicit and complete definition of pedestrian attribute recognition, we formally define the task and distinguish it from other similar tasks. Second, based on the proposed definition, we expose the limitations of the existing datasets, which violate the academic norm and are inconsistent with the essential requirements of practical industry applications. We therefore propose two datasets, PETA\textsubscript{$ZS$} and RAP\textsubscript{$ZS$}, constructed following the zero-shot setting on pedestrian identity. We also introduce several realistic criteria for future pedestrian attribute dataset construction. Finally, we reimplement existing state-of-the-art methods and introduce a strong baseline method to give reliable evaluations and fair comparisons. Experiments are conducted on four existing datasets and the two proposed datasets to measure progress on pedestrian attribute recognition.
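To make the zero-shot setting on pedestrian identity concrete, the following is a minimal sketch, assuming it means that no pedestrian identity appears in both the training and the test split. The function name `identity_disjoint_split`, the `(image_path, identity_id)` sample format, and the `test_fraction` parameter are hypothetical and chosen for illustration; this is not the construction procedure actually used for PETA\textsubscript{$ZS$} or RAP\textsubscript{$ZS$}.

```python
import random
from collections import defaultdict


def identity_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (image_path, identity_id) pairs so that no identity is shared.

    Illustrative assumption: "zero-shot on identity" is read here as
    "train and test identities are disjoint".
    """
    # Group all images belonging to the same pedestrian identity.
    by_identity = defaultdict(list)
    for image_path, identity_id in samples:
        by_identity[identity_id].append(image_path)

    # Shuffle identities (not images) with a fixed seed for reproducibility.
    identities = sorted(by_identity)
    random.Random(seed).shuffle(identities)

    # Reserve a fraction of the identities, with all their images, for testing.
    n_test = max(1, round(len(identities) * test_fraction))
    test_ids = set(identities[:n_test])

    train = [(p, i) for i in identities if i not in test_ids for p in by_identity[i]]
    test = [(p, i) for i in identities if i in test_ids for p in by_identity[i]]
    return train, test
```

Splitting by identity rather than by image is the point of the sketch: a random per-image split can place near-identical images of the same person in both sets, so that test performance partly reflects memorized identities rather than attribute recognition.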