Human actions in videos are 3D signals. However, few methods are available for recognizing the actions of multiple people. In long videos, it is also difficult to search for a specific action and/or person. To address this, this paper proposes a new technique for multiple human action recognition and summarization in surveillance videos. The approach introduces a new representation of the data by extracting the sequence of each person from the scene. Each sequence is then analyzed to detect and recognize the corresponding actions using 3D convolutional neural networks (3DCNNs). Action-based video summarization is performed by saving each person's action at each time point of the video. The results show that the proposed method provides accurate multi-human action recognition that can easily be used to summarize any action. Further, for videos collected from the internet, which are complex and not built for surveillance applications, the proposed model was evaluated on datasets such as UCF101 and YouTube without any preprocessing. For this category of videos, summarization is performed on the video sequences by summarizing the actions in each subsequence. The results obtained demonstrate its efficiency compared to state-of-the-art methods.
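The action-based summarization step described in the abstract (saving each person's action at each time of the video) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that person tracking and 3DCNN classification have already produced one `(time_index, person_id, action_label)` record per classified clip, and the names `summarize_actions` and `find_action` are hypothetical helpers introduced only for this example.

```python
from collections import defaultdict
from itertools import groupby
from typing import Dict, List, Tuple

# Assumed input format (not from the paper): one record per classified clip.
Prediction = Tuple[int, int, str]   # (time_index, person_id, action_label)
Segment = Tuple[int, int, str]      # (start_time, end_time, action_label), end inclusive


def summarize_actions(predictions: List[Prediction]) -> Dict[int, List[Segment]]:
    """Collapse per-clip labels into per-person action segments."""
    per_person: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    # Sorting by time first keeps each person's records in temporal order.
    for time_index, person_id, action in sorted(predictions):
        per_person[person_id].append((time_index, action))

    summary: Dict[int, List[Segment]] = {}
    for person_id, records in per_person.items():
        segments: List[Segment] = []
        # Merge consecutive records that carry the same action label.
        # Simplification: gaps in the time index are not treated as segment breaks.
        for action, run in groupby(records, key=lambda record: record[1]):
            run_list = list(run)
            segments.append((run_list[0][0], run_list[-1][0], action))
        summary[person_id] = segments
    return summary


def find_action(summary: Dict[int, List[Segment]], action: str) -> List[Tuple[int, int, int]]:
    """Return (person_id, start_time, end_time) for every segment with the given action."""
    return [
        (person_id, start, end)
        for person_id, segments in summary.items()
        for start, end, label in segments
        if label == action
    ]
```

With such an index, searching a long video for a specific action or person reduces to a lookup over the stored segments, for example `find_action(summary, "run")`.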