Temporal Intention Localization (TIL) is crucial for video surveillance, focusing on identifying varying levels of suspicious intentions to improve security monitoring. However, existing discrete classification methods fail to capture the continuous nature of suspicious intentions, limiting early intervention and explainability. In this paper, we propose the Suspicion Progression Analysis Network (SPAN), which shifts from discrete classification to continuous regression, enabling the capture of fluctuating and evolving suspicious intentions. We reveal that suspicion exhibits long-term dependencies and cumulative effects, similar to Temporal Point Process (TPP) theory. Based on these insights, we define a suspicion score formula that models continuous changes while accounting for temporal characteristics. We also introduce Suspicion Coefficient Modulation, which adjusts suspicion coefficients using multimodal information to reflect the varying impacts of suspicious actions. Additionally, the Concept-Anchored Mapping method is proposed to link suspicious actions to predefined intention concepts, offering insights into both the actions and their potential underlying intentions. Extensive experiments on the HAI dataset show that SPAN significantly outperforms existing methods, reducing MSE by 19.8% and improving average mAP by 1.78%. Notably, SPAN achieves a 2.74% mAP gain in low-frequency cases, demonstrating its superior ability to capture subtle behavioral changes. Compared to discrete classification systems, our continuous suspicion modeling approach enables earlier detection and proactive intervention, greatly enhancing system explainability and practical utility in security applications.
翻译:暂无翻译