Despite constant advances in computer vision, whether modern single-image detectors can be integrated into real-time handgun alarm systems for video surveillance remains debatable. Using such detectors still produces a high number of false alarms and false negatives. In this context, most existing studies select one of the latest single-image detectors and either train it on a better dataset or apply some pre-processing, post-processing or data-fusion approach to further reduce false alarms. However, none of these works has tried to exploit the temporal information present in the videos to mitigate false detections. This paper presents a new system, called MULTI Confirmation-level Alarm SysTem based on Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM) (MULTICAST), that leverages not only the spatial information but also the temporal information in the videos for more reliable handgun detection. MULTICAST consists of three stages: i) a handgun detection stage, ii) a CNN-based spatial confirmation stage and iii) an LSTM-based temporal confirmation stage. The temporal confirmation stage uses the positions of the detected handgun at previous instants to predict its trajectory in the next frame. Our experiments show that MULTICAST reduces the number of false alarms by 80% with respect to a Faster R-CNN-based single-image detector, which makes it more useful for providing effective and rapid security responses.
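To make the idea of temporal confirmation concrete, the following is a minimal sketch, not the MULTICAST implementation. The abstract does not say how the predicted trajectory is compared with new detections, so the distance-threshold rule here is an assumption. The LSTM predictor is also replaced by a simple constant-velocity extrapolation to keep the example self-contained, and the names `TemporalConfirmer`, `predict_next`, `window` and `max_dist` are hypothetical.

```python
from collections import deque
from math import hypot
from typing import Deque, Optional, Tuple

Point = Tuple[float, float]  # (x, y) centre of a detected bounding box, in pixels


def predict_next(history: Deque[Point]) -> Optional[Point]:
    """Constant-velocity stand-in for the LSTM predictor (an assumption).

    Extrapolates the next centre from the last two observed centres.
    Returns None when there is not yet enough history.
    """
    if len(history) < 2:
        return None
    (x1, y1), (x2, y2) = history[-2], history[-1]
    return (2 * x2 - x1, 2 * y2 - y1)


class TemporalConfirmer:
    """Confirms a detection only if it lies close to the predicted position."""

    def __init__(self, window: int = 5, max_dist: float = 30.0) -> None:
        # Hypothetical parameters: how many past positions to keep, and the
        # largest pixel distance still counted as consistent with the prediction.
        self.history: Deque[Point] = deque(maxlen=window)
        self.max_dist = max_dist

    def update(self, center: Point) -> bool:
        """Record a new detection and report whether it is temporally consistent."""
        predicted = predict_next(self.history)
        self.history.append(center)
        if predicted is None:
            return False  # not enough history to confirm anything yet
        dx, dy = center[0] - predicted[0], center[1] - predicted[1]
        return hypot(dx, dy) <= self.max_dist
```

In this sketch, a detection that continues a smooth track is confirmed, while one that jumps far from the extrapolated position is rejected as a likely false alarm. A learned sequence model could replace `predict_next` without changing the surrounding logic.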