Activity detection is an attractive computer vision task for exploiting the video streams captured by widely installed cameras. Although they achieve impressive performance, conventional activity detection algorithms are usually designed under certain constraints, such as taking trimmed and/or object-centered video clips as input. As a result, they fail to handle the multi-scale, multi-instance cases found in real-world unconstrained video streams, which are untrimmed and have large fields of view. The real-time requirements of streaming analysis also make a brute-force extension of these algorithms infeasible. To overcome these issues, we propose Argus++, a robust real-time activity detection system for analyzing unconstrained video streams. The design of Argus++ introduces overlapping spatio-temporal cubes as an intermediate concept of activity proposals, ensuring coverage and completeness of activity detection through over-sampling. The overall system is optimized for real-time processing on standalone consumer-level hardware. Extensive experiments on different surveillance and driving scenarios demonstrate its superior performance across a series of activity detection benchmarks, including CVPR ActivityNet ActEV 2021, NIST ActEV SDL UF/KF, TRECVID ActEV 2020/2021, and ICCV ROAD 2021.
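To make the idea of overlapping spatio-temporal cubes concrete, the following is a minimal sketch, not the Argus++ implementation. It assumes a tracked object is given as a mapping from frame index to bounding box, and it over-samples the track with fixed-length temporal windows whose stride is smaller than their duration, so that consecutive cubes overlap. The names `temporal_windows` and `cubes_for_track`, and the `duration` and `stride` parameters, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]   # (x0, y0, x1, y1)
Cube = Tuple[int, int, Box]               # (start_frame, end_frame_exclusive, box)


def temporal_windows(num_frames: int, duration: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open frame windows of length `duration`, advanced by `stride`.

    With stride < duration the windows overlap, which is the over-sampling idea.
    The last window is clipped to the end of the sequence.
    """
    if duration <= 0 or stride <= 0:
        raise ValueError("duration and stride must be positive")
    windows = []
    start = 0
    while start < num_frames:
        windows.append((start, min(start + duration, num_frames)))
        if start + duration >= num_frames:
            break  # this window already reaches the end of the sequence
        start += stride
    return windows


def cubes_for_track(track: Dict[int, Box], duration: int, stride: int) -> List[Cube]:
    """Build overlapping cubes for one tracked object (hypothetical helper).

    Each cube spans one temporal window and uses the union of the object's
    boxes inside that window as its spatial extent.
    """
    if not track:
        return []
    first = min(track)
    length = max(track) - first + 1
    cubes: List[Cube] = []
    for s, e in temporal_windows(length, duration, stride):
        boxes = [track[f] for f in range(first + s, first + e) if f in track]
        if not boxes:
            continue  # the object was not observed in this window
        union = (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )
        cubes.append((first + s, first + e, union))
    return cubes
```

In this sketch, an activity that straddles the boundary of one window is still fully contained in a neighboring window as long as it is shorter than `duration - stride` frames, which is one way to read the abstract's claim about coverage through over-sampling.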