Spatio-temporal action detection is an important and challenging problem in video understanding. However, existing large-scale spatio-temporal action datasets are of limited use in specific application fields, and there is currently no public tool for building such datasets, so researchers must spend considerable time and effort to create their own. We therefore propose a multi-person video dataset annotation method for spatio-temporal actions. First, we use FFmpeg to trim the videos and extract frames from them. We then use YOLOv5 to detect the humans in each video frame, and Deep SORT to assign an ID to each detected human across frames. By processing the outputs of YOLOv5 and Deep SORT, we obtain the annotation file of the spatio-temporal action dataset, which completes the customization of the dataset.
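The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `TrackedBox` record, the function names, the one-keyframe-per-second sampling, and the AVA-style CSV column order (video ID, timestamp, normalized box, action ID, person ID) are all assumptions made for illustration. The detector and tracker are not called here; their per-frame output is assumed to be already available as pixel boxes with track IDs. The two FFmpeg helpers only build argument lists and do not run anything.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class TrackedBox(NamedTuple):
    """Hypothetical record for one tracked person in one extracted frame."""
    frame_index: int  # 0-based index of the extracted frame
    track_id: int     # person ID assigned by the tracker
    x1: float         # box corners in pixels
    y1: float
    x2: float
    y2: float


def ffmpeg_trim_cmd(src: str, dst: str, start_s: int, duration_s: int) -> List[str]:
    """Build (but do not run) an FFmpeg command that trims a clip."""
    return ["ffmpeg", "-ss", str(start_s), "-t", str(duration_s),
            "-i", src, "-c", "copy", dst]


def ffmpeg_frames_cmd(src: str, out_pattern: str, fps: int) -> List[str]:
    """Build (but do not run) an FFmpeg command that extracts frames."""
    return ["ffmpeg", "-i", src, "-r", str(fps), out_pattern]


def to_annotation_rows(video_id: str, boxes: Iterable[TrackedBox],
                       width: int, height: int, fps: int) -> List[list]:
    """Turn tracked boxes into AVA-style rows (assumed layout).

    Only one keyframe per second is kept (assumption), box coordinates are
    normalized to [0, 1], and the action ID is left empty so that a human
    annotator can fill it in later.
    """
    rows = []
    for b in sorted(boxes, key=lambda b: (b.frame_index, b.track_id)):
        if b.frame_index % fps != 0:
            continue  # skip frames that are not keyframes
        timestamp = b.frame_index // fps
        rows.append([
            video_id, timestamp,
            round(b.x1 / width, 3), round(b.y1 / height, 3),
            round(b.x2 / width, 3), round(b.y2 / height, 3),
            "",          # action ID placeholder, to be annotated
            b.track_id,  # person ID from the tracker
        ])
    return rows


def rows_to_csv(rows: List[list]) -> str:
    """Serialize annotation rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

In this sketch the tracker's ID is what links the same person across keyframes, so each CSV row can later receive an action label without re-identifying the person by hand.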