Strengthening automated violence monitoring and surveillance systems amid growing social violence and extremist activity worldwide could keep communities safe and save lives. The questionable reliability of human monitoring personnel and the increasing number of surveillance cameras make automated, artificial intelligence-based solutions compelling. Raising the accuracy and performance of current state-of-the-art deep learning approaches to video violence recognition could make surveillance systems more reliable and scalable. The main contribution of the proposed deep reinforcement learning method is that it achieves state-of-the-art accuracy on the RWF, Hockey, and Movies datasets while removing some of the computationally expensive processes and input features used in previous solutions. Implementing hard attention with a semi-supervised learning method also makes the proposed method capable of rough violence localization and increases the agent's interpretability within the violence detection system.