Person search generally involves three parts: person detection, feature extraction and identity comparison. However, a pipeline that chains detection, extraction and comparison has two drawbacks. First, detection accuracy limits comparison accuracy. Second, real-time performance is difficult to achieve in real-world applications. To solve these problems, we propose a Multi-task Joint Framework for real-time person search (MJF), which optimizes person detection, feature extraction and identity comparison separately. For the person detection module, we propose the YOLOv5-GS model, which is trained on a person dataset. It combines the advantages of GhostNet and the Squeeze-and-Excitation (SE) block, improving both speed and accuracy. For the feature extraction module, we design the Model Adaptation Architecture (MAA), which selects a different network according to the number of people in the scene and thereby balances accuracy against speed. For identity comparison, we propose a Three Dimension (3D) Pooled Table and a matching strategy to improve identification accuracy. On 1920×1080 video with a table of 500 IDs, our method reaches an identification rate (IR) of 93.6% at 25.7 frames per second (FPS).
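To make the trade-off behind the MAA and the identity comparison step concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The tier thresholds, the extractor names (`large`, `medium`, `small`), the helper functions and the similarity threshold are all hypothetical. The lookup shown is a plain flat table with cosine similarity; it stands in for, and does not reproduce, the 3D Pooled Table and matching strategy, whose details the abstract does not give.

```python
from math import sqrt

# Hypothetical tiers: (maximum number of people, extractor name).
# Thresholds and names are illustrative assumptions, not values from the paper.
EXTRACTOR_TIERS = [(3, "large"), (10, "medium")]
FALLBACK_EXTRACTOR = "small"


def select_extractor(num_people):
    """Pick a heavier network for sparse frames and a lighter one for crowded frames."""
    for max_people, name in EXTRACTOR_TIERS:
        if num_people <= max_people:
            return name
    return FALLBACK_EXTRACTOR


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_identity(query, table, threshold=0.5):
    """Return the ID whose stored feature is most similar to the query.

    Returns None when no entry scores above the (assumed) threshold.
    `table` is a flat dict of ID -> feature vector, a stand-in for the
    paper's 3D Pooled Table.
    """
    best_id, best_score = None, threshold
    for identity, feature in table.items():
        score = cosine_similarity(query, feature)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id
```

In this sketch the per-frame cost of feature extraction stays roughly bounded, because the number of detections drives the choice of network: few people allow a slower, more accurate extractor, while crowded frames fall back to a faster one.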