This paper presents our solution to the ACM MM challenge: Large-scale Human-centric Video Analysis in Complex Events~\cite{lin2020human}; specifically, we focus on Track 3: Crowd Pose Tracking in Complex Events. Remarkable progress has been made in multi-pose training in recent years. However, tracking human poses in crowded and complex environments has not been well addressed. We decompose the problem into three subproblems. First, we use a multi-object tracking method to assign a human ID to each bounding box generated by the detection model. Next, a pose is estimated for each ID-labeled bounding box. Finally, optical flow is used to exploit the temporal information in the videos and generate the final pose tracking result.
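The first stage described above, assigning a persistent human ID to each detected bounding box, can be illustrated with a minimal sketch. This is not the tracking method used in the paper, which the abstract does not name. It is a greedy IoU-matching stand-in, and the names `iou`, `assign_ids`, and `iou_thresh` are hypothetical. Pose estimation and optical-flow refinement are not shown.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_ids(frames: List[List[Box]],
               iou_thresh: float = 0.3) -> List[Dict[int, Box]]:
    """Greedy frame-to-frame ID assignment (illustrative assumption only).

    Each box inherits the ID of the best-overlapping box in the previous
    frame if their IoU reaches iou_thresh; otherwise it gets a new ID.
    Tracks that are unmatched in a frame are dropped.
    """
    next_id = 0
    prev: Dict[int, Box] = {}
    tracks: List[Dict[int, Box]] = []
    for boxes in frames:
        # All (score, previous ID, box index) candidates, best overlap first.
        pairs = sorted(
            ((iou(pb, b), pid, j)
             for pid, pb in prev.items()
             for j, b in enumerate(boxes)),
            reverse=True,
        )
        cur: Dict[int, Box] = {}
        used = set()
        for score, pid, j in pairs:
            if score < iou_thresh:
                break  # remaining candidates overlap too little
            if pid in cur or j in used:
                continue  # ID or box already matched
            cur[pid] = boxes[j]
            used.add(j)
        # Unmatched detections start new tracks.
        for j, b in enumerate(boxes):
            if j not in used:
                cur[next_id] = b
                next_id += 1
        tracks.append(cur)
        prev = cur
    return tracks
```

In a full pipeline along the lines of the abstract, each ID-labeled box would then be cropped and passed to a pose estimator, with optical flow used afterwards to refine the poses across frames.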