In this paper, we propose a new architecture for real-time anomaly detection in video data, inspired by human behavior combining spatial and temporal analyses. This approach uses two distinct models: (i) for temporal analysis, a recurrent convolutional network (CNN + RNN) is employed, associating VGG19 and a GRU to process video sequences; (ii) regarding spatial analysis, it is performed using YOLOv7 to analyze individual images. These two analyses can be carried out either in parallel, with a final prediction that combines the results of both analysis, or in series, where the spatial analysis enriches the data before the temporal analysis. Some experimentations are been made to compare these two architectural configurations with each other, and evaluate the effectiveness of our hybrid approach in video anomaly detection.
翻译:暂无翻译